MOLECULES FOR DISEASE DETECTION AND TREATMENT

TECHNICAL FIELD 

The present invention relates to molecules for disease detection and treatment and to the use of 
these sequences in the diagnosis, study, prevention, and treatment of diseases associated with, as well as 
effects of exogenous compounds on, the expression of molecules for disease detection and treatment. 



BACKGROUND OF THE INVENTION 

The human genome is comprised of thousands of genes, many encoding gene products that
function in the maintenance and growth of the various cells and tissues in the body. Aberrant
expression of, or mutations in, these genes and their products are the cause of, or are associated with,
a variety of human diseases such as cancer and other cell proliferative disorders. The identification of
these genes and their products is the basis of an ever-expanding effort to find markers for early
detection of diseases, and targets for their prevention and treatment.

For example, cancer represents a type of cell proliferative disorder that affects nearly every
tissue in the body. A wide variety of molecules, either aberrantly expressed or mutated, can be the
cause of, or involved with, various cancers because tissue growth involves complex and ordered
patterns of cell proliferation, cell differentiation, and apoptosis. Cell proliferation must be regulated to
maintain both the number of cells and their spatial organization. This regulation depends upon the
appropriate expression of proteins which control cell cycle progression in response to extracellular
signals such as growth factors and other mitogens, and intracellular cues such as DNA damage or
nutrient starvation. Molecules which directly or indirectly modulate cell cycle progression fall into
several categories, including growth factors and their receptors, second messenger and signal
transduction proteins, oncogene products, tumor-suppressor proteins, and mitosis-promoting factors.
Aberrant expression or mutations in any of these gene products can result in cell proliferative disorders
such as cancer. Oncogenes are genes generally derived from normal genes that, through abnormal
expression or mutation, can effect the transformation of a normal cell to a malignant one (oncogenesis).
Oncoproteins, encoded by oncogenes, can affect cell proliferation in a variety of ways and include
growth factors, growth factor receptors, intracellular signal transducers, nuclear transcription factors,
and cell-cycle control proteins. In contrast, tumor-suppressor genes are involved in inhibiting cell
proliferation. Mutations which cause reduced function or loss of function in tumor-suppressor genes
result in aberrant cell proliferation and cancer. Thus a wide variety of genes and their products have
been found that are associated with cell proliferative disorders such as cancer, but many more may
exist that are yet to be discovered.

DNA-based arrays can provide a simple way to explore the expression of a single polymorphic
gene or a large number of genes. When the expression of a single gene is explored, DNA-based arrays
are employed to detect the expression of specific gene variants. For example, a p53 tumor suppressor
gene array is used to determine whether individuals are carrying mutations that predispose them to
cancer. A cytochrome p450 gene array is useful to determine whether individuals have one of a number
of specific mutations that could result in increased drug metabolism, drug resistance or drug toxicity.

DNA-based array technology is especially relevant for the rapid screening of expression of a 
large number of genes. There is a growing awareness that gene expression is affected in a global 
fashion. A genetic predisposition, disease or therapeutic treatment may affect, directly or indirectly, the 
expression of a large number of genes. In some cases the interactions may be expected, such as when
the genes are part of the same signaling pathway. In other cases, such as when the genes participate in
separate signaling pathways, the interactions may be totally unexpected. Therefore, DNA-based arrays 
can be used to investigate how genetic predisposition, disease, or therapeutic treatment affects the 
expression of a large number of genes. 

The discovery of new molecules for disease detection and treatment satisfies a need in the art
by providing new compositions which are useful in the diagnosis, study, prevention, and treatment of
diseases associated with, as well as effects of exogenous compounds on, the expression of molecules for
disease detection and treatment.

SUMMARY OF THE INVENTION 

The present invention relates to human disease detection and treatment molecule
polynucleotides (mddt) as presented in the Sequence Listing. The mddt uniquely identify genes
encoding structural, functional, and regulatory disease detection and treatment molecules. 

The invention provides an isolated polynucleotide comprising a polynucleotide sequence
selected from the group consisting of a) a polynucleotide sequence selected from the group consisting of
SEQ ID NO:1-45; b) a naturally occurring polynucleotide sequence having at least 90% sequence
identity to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; c) a
polynucleotide sequence complementary to a); d) a polynucleotide sequence complementary to b); and
e) an RNA equivalent of a) through d). In one alternative, the polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45. In another alternative,
the polynucleotide comprises at least 60 contiguous nucleotides of a polynucleotide sequence selected
from the group consisting of a) a polynucleotide sequence selected from the group consisting of SEQ ID
NO:1-45; b) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; c) a polynucleotide
sequence complementary to a); d) a polynucleotide sequence complementary to b); and e) an RNA
equivalent of a) through d). The invention further provides a composition for the detection of
expression of disease detection and treatment molecule polynucleotides comprising at least one isolated
polynucleotide comprising a polynucleotide sequence selected from the group consisting of a) a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; b) a naturally
occurring polynucleotide sequence having at least 90% sequence identity to a polynucleotide sequence
selected from the group consisting of SEQ ID NO:1-45; c) a polynucleotide sequence complementary to
a); d) a polynucleotide sequence complementary to b); and e) an RNA equivalent of a) through d); and
a detectable label.

The invention also provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide comprising a polynucleotide sequence selected from the group consisting of a) a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; b) a naturally
occurring polynucleotide sequence having at least 90% sequence identity to a polynucleotide sequence
selected from the group consisting of SEQ ID NO:1-45; c) a polynucleotide sequence complementary to
a); d) a polynucleotide sequence complementary to b); and e) an RNA equivalent of a) through d). The
method comprises a) amplifying said target polynucleotide or a fragment thereof using polymerase
chain reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.

The invention also provides a method for detecting a target polynucleotide in a sample, said
target polynucleotide comprising a polynucleotide sequence selected from the group consisting of a) a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; b) a naturally
occurring polynucleotide sequence having at least 90% sequence identity to a polynucleotide sequence
selected from the group consisting of SEQ ID NO:1-45; c) a polynucleotide sequence complementary to
a); d) a polynucleotide sequence complementary to b); and e) an RNA equivalent of a) through d). The
method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex
is formed between said probe and said target polynucleotide, and b) detecting the presence or absence of
said hybridization complex, and, optionally, if present, the amount thereof. In one alternative, the probe
comprises at least 30 contiguous nucleotides. In another alternative, the probe comprises at least 60
contiguous nucleotides.

The invention further provides a recombinant polynucleotide comprising a promoter sequence
operably linked to an isolated polynucleotide comprising a polynucleotide sequence selected from the
group consisting of a) a polynucleotide sequence selected from the group consisting of SEQ ID
NO:1-45; b) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; c) a polynucleotide
sequence complementary to a); d) a polynucleotide sequence complementary to b); and e) an RNA
equivalent of a) through d). In one alternative, the invention provides a cell transformed with the
recombinant polynucleotide. In another alternative, the invention provides a transgenic organism
comprising the recombinant polynucleotide. In a further alternative, the invention provides a method
for producing a disease detection and treatment molecule polypeptide, the method comprising a)
culturing a cell under conditions suitable for expression of the disease detection and treatment molecule
polypeptide, wherein said cell is transformed with the recombinant polynucleotide, and b) recovering the
disease detection and treatment molecule polypeptide so expressed.

The invention also provides a purified disease detection and treatment molecule polypeptide
(MDDT) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:1-45. Additionally, the invention provides an isolated antibody
which specifically binds to the disease detection and treatment molecule polypeptide. The invention
further provides a method of identifying a test compound which specifically binds to the disease
detection and treatment molecule polypeptide, the method comprising the steps of a) providing a test
compound; b) combining the disease detection and treatment molecule polypeptide with the test
compound for a sufficient time and under suitable conditions for binding; and c) detecting binding of the
disease detection and treatment molecule polypeptide to the test compound, thereby identifying the test
compound which specifically binds the disease detection and treatment molecule polypeptide.

The invention further provides a microarray wherein at least one element of the microarray is
an isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide comprising
a polynucleotide sequence selected from the group consisting of a) a polynucleotide sequence selected
from the group consisting of SEQ ID NO:1-45; b) a naturally occurring polynucleotide sequence having
at least 90% sequence identity to a polynucleotide sequence selected from the group consisting of SEQ
ID NO:1-45; c) a polynucleotide sequence complementary to a); d) a polynucleotide sequence
complementary to b); and e) an RNA equivalent of a) through d). The invention also provides a method
for generating a transcript image of a sample which contains polynucleotides. The method comprises a)
labeling the polynucleotides of the sample, b) contacting the elements of the microarray with the labeled
polynucleotides of the sample under conditions suitable for the formation of a hybridization complex,
and c) quantifying the expression of the polynucleotides in the sample.

Additionally, the invention provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of a) a polynucleotide sequence selected
from the group consisting of SEQ ID NO:1-45; b) a naturally occurring polynucleotide sequence having
at least 90% sequence identity to a polynucleotide sequence selected from the group consisting of SEQ
ID NO:1-45; c) a polynucleotide sequence complementary to a); d) a polynucleotide sequence
complementary to b); and e) an RNA equivalent of a) through d). The method comprises a) exposing a
sample comprising the target polynucleotide to a compound, b) detecting altered expression of the
target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of
varying amounts of the compound and in the absence of the compound.

The invention further provides a method for assessing toxicity of a test compound, said method
comprising a) treating a biological sample containing nucleic acids with the test compound; b)
hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide comprising a polynucleotide sequence selected from the
group consisting of i) a polynucleotide sequence selected from the group consisting of SEQ ID
NO:1-45; ii) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; iii) a polynucleotide
sequence complementary to i), iv) a polynucleotide sequence complementary to ii), and v) an RNA
equivalent of i)-iv). Hybridization occurs under conditions whereby a specific hybridization complex
is formed between said probe and a target polynucleotide in the biological sample, said target
polynucleotide comprising a polynucleotide sequence selected from the group consisting of i) a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45; ii) a naturally
occurring polynucleotide sequence having at least 90% sequence identity to a polynucleotide
sequence selected from the group consisting of SEQ ID NO:1-45; iii) a polynucleotide sequence
complementary to i), iv) a polynucleotide sequence complementary to ii), and v) an RNA equivalent
of i)-iv), and alternatively, the target polynucleotide comprises a fragment of a polynucleotide sequence
selected from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex;
and d) comparing the amount of hybridization complex in the treated biological sample with the amount
of hybridization complex in an untreated biological sample, wherein a difference in the amount of
hybridization complex in the treated biological sample is indicative of toxicity of the test compound.

The invention further provides an isolated polypeptide comprising an amino acid sequence
selected from the group consisting of a) an amino acid sequence selected from the group consisting of
SEQ ID NO:46-90, b) a naturally occurring amino acid sequence having at least 90% sequence identity
to an amino acid sequence selected from the group consisting of SEQ ID NO:46-90, c) a biologically
active fragment of an amino acid sequence selected from the group consisting of SEQ ID NO:46-90,
and d) an immunogenic fragment of an amino acid sequence selected from the group consisting of SEQ
ID NO:46-90. In one alternative, the invention provides an isolated polypeptide comprising the amino
acid sequence of SEQ ID NO:46-90.



DESCRIPTION OF THE TABLES

Table 1 shows the sequence identification numbers (SEQ ID NO:s) and template identification 
numbers (template IDs) corresponding to the polynucleotides of the present invention, along with their 
GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the
GenBank hits.

Table 2 shows the sequence identification numbers (SEQ ID NO:s) and template identification 
numbers (template IDs) corresponding to the polynucleotides of the present invention, along with
polynucleotide segments of each template sequence as defined by the indicated "start" and "stop"
nucleotide positions. The reading frames of the polynucleotide segments and the Pfam hits, Pfam

descriptions, and E-values corresponding to the polypeptide domains encoded by the polynucleotide 
segments are indicated. 

Table 3 shows the sequence identification numbers (SEQ ID NO:s) and template identification 
numbers (template IDs) corresponding to the polynucleotides of the present invention, along with 
polynucleotide segments of each template sequence as defined by the indicated "start" and "stop"
nucleotide positions. The reading frames of the polynucleotide segments are shown, and the
polypeptides encoded by the polynucleotide segments constitute either signal peptide (SP) or
transmembrane (TM) domains, as indicated. The membrane topology of the encoded polypeptide
sequence is indicated, the N-terminus (N) listed as being oriented to either the cytosolic (in) or
non-cytosolic (out) side of the cell membrane or organelle.

Table 4 shows the sequence identification numbers (SEQ ID NO:s) corresponding to the 
polynucleotides of the present invention, along with component sequence identification numbers 
(component IDs) corresponding to each template. The component sequences, which were used to 
assemble the template sequences, are defined by the indicated "start" and "stop" nucleotide positions
along each template.

Table 5 shows the tissue distribution profiles for the templates of the invention.

Table 6 shows the sequence identification numbers (SEQ ID NO:s) corresponding to the
polypeptides of the present invention, along with the reading frames used to obtain the polypeptide
segments, the lengths of the polypeptide segments, the "start" and "stop" nucleotide positions of the
polynucleotide sequences used to define the encoded polypeptide segments, the GenBank hits (GI
Numbers), probability scores, and functional annotations corresponding to the GenBank hits.

Table 7 summarizes the bioinformatics tools which are useful for analysis of the
polynucleotides of the present invention. The first column of Table 7 lists analytical tools, programs,
and algorithms, the second column provides brief descriptions thereof, the third column presents
appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth
column presents, where applicable, the scores, probability values, and other parameters used to evaluate
the strength of a match between two sequences (the higher the score, the greater the homology between
two sequences).

DETAILED DESCRIPTION OF THE INVENTION 

Before the nucleic acid sequences and methods are presented, it is to be understood that this 
invention is not limited to the particular machines, methods, and materials described. Although 
particular embodiments are described, machines, methods, and materials similar or equivalent to these 
embodiments may be used to practice the invention. The preferred machines, methods, and materials
set forth are not intended to limit the scope of the invention, which is limited only by the appended
claims. 

The singular forms "a", "an", and "the" include plural reference unless the context clearly
dictates otherwise. All technical and scientific terms have the meanings commonly understood by one
of ordinary skill in the art. All publications are incorporated by reference for the purpose of describing
and disclosing the cell lines, vectors, and methodologies which are presented and which might be used in 
connection with the invention. Nothing in the specification is to be construed as an admission that the 
invention is not entitled to antedate such disclosure by virtue of prior invention. 

Definitions

As used herein, the lower case "mddt" refers to a nucleic acid sequence, while the upper case
"MDDT" refers to an amino acid sequence encoded by mddt. A "full-length" mddt refers to a nucleic 
acid sequence containing the entire coding region of a gene endogenously expressed in human tissue. 

"Adjuvants" are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and 
surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's immunological 
response. 

"Allele" refers to an alternative form of a nucleic acid sequence. Alleles result from a 
"mutation," a change or an alternative reading of the genetic code. Any given gene may have none, one, 
or many allelic forms. Mutations which give rise to alleles include deletions, additions, or substitutions
of nucleotides. Each of these changes may occur alone, or in combination with the others, one or more 
times in a given nucleic acid sequence. The present invention encompasses allelic mddt. 

"Amino acid sequence" refers to a peptide, a polypeptide, or a protein of either natural or 
synthetic origin. The amino acid sequence is not limited to the complete, endogenous amino acid 
sequence and may be a fragment, epitope, variant, or derivative of a protein expressed by a nucleic acid
sequence. 

"Amplification" refers to the production of additional copies of a sequence and is carried out
using polymerase chain reaction (PCR) technologies well known in the art.

"Antibody" refers to intact molecules as well as to fragments thereof, such as Fab, F(ab')2, and
Fv fragments, which are capable of binding the epitopic determinant. Antibodies that bind MDDT
polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of
interest as the immunizing antigen. The polypeptide or peptide used to immunize an animal (e.g., a
mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and
can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled
to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The
coupled peptide is then used to immunize the animal.

"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target
sequence. The antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog such
as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine.

"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target 
sequence. The antisense sequence can be DNA, RNA, or any nucleic acid mimic or analog.

"Antisense technology" refers to any technology which relies on the specific hybridization of an 
antisense sequence to a target sequence. 

A "bin" is a portion of computer memory space used by a computer program for storage of 
data, and bounded in such a manner that data stored in a bin may be retrieved by the program. 

"Biologically active" refers to an amino acid sequence having a structural, regulatory, or
biochemical function of a naturally occurring amino acid sequence.

"Clone joining" is a process for combining gene bins based upon the bins' containing sequence 
information from the same clone. The sequences may assemble into a primary gene transcript as well 
as one or more splice variants. 

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
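
The base-pairing relationship in the definition of "complementary" can be illustrated with a short Python sketch. The function names and the restriction to the four unmodified DNA bases are assumptions made for illustration only; they are not part of the specification.

```python
# Illustrative sketch only: Watson-Crick pairing for unmodified DNA bases.
# RNA (U), ambiguity codes, and modified bases are not handled here.
_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Pair each base in place; for a 5'->3' input the result reads 3'->5'."""
    return "".join(_PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in the 5'->3' direction."""
    return complement(seq)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two strands, both written 5'->3', pair along their full length."""
    return reverse_complement(strand_a) == strand_b.upper()
```

With these helpers, `complement("AGT")` gives `"TCA"`, matching the 5'-A-G-T-3' and 3'-T-C-A-5' pair in the definition, and the same complementary strand written 5' to 3' is `"ACT"`.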

A "component sequence" is a nucleic acid sequence selected by a computer program such as 
PHRED and used to assemble a consensus or template sequence from one or more component 
sequences. 
A "consensus sequence" or "template sequence" is a nucleic acid sequence which has been
assembled from overlapping sequences, using a computer program for fragment assembly such as the
GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a
relational database management system (RDMS).

"Conservative amino acid substitutions" are those substitutions that, when made, least interfere
with the properties of the original protein, i.e., the structure and especially the function of the protein is
conserved and not significantly changed by such substitutions. The table below shows amino acids
which may be substituted for an original amino acid in a protein and which are regarded as conservative
substitutions.

Original Residue        Conservative Substitution

Ala                     Gly, Ser
Arg                     His, Lys
Asn                     Asp, Gln, His
Asp                     Asn, Glu
Cys                     Ala, Ser
Gln                     Asn, Glu, His
Glu                     Asp, Gln, His
Gly                     Ala
His                     Asn, Arg, Gln, Glu
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Ile
Phe                     His, Met, Leu, Trp, Tyr
Ser                     Cys, Thr
Thr                     Ser, Val
Trp                     Phe, Tyr
Tyr                     His, Phe, Trp
Val                     Ile, Leu, Thr



Conservative substitutions generally maintain (a) the structure of the polypeptide backbone in
the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or
hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
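
As an illustration, the substitution table can be encoded as a lookup keyed by the original residue. The sketch below is a hypothetical helper, not part of the specification; the names `CONSERVATIVE` and `is_conservative` are assumptions. Note that the table is directional: Phe may be replaced by His, but His is not listed as replaceable by Phe.

```python
# Illustrative encoding of the conservative substitution table
# (three-letter codes; key = original residue, value = allowed replacements).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as conservative for `original`.

    Residues absent from the table (for example Pro) have no listed
    conservative substitutions, so the lookup returns False for them.
    """
    return replacement in CONSERVATIVE.get(original, set())
```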

"Deletion" refers to a change in either a nucleic or amino acid sequence in which at least one
nucleotide or amino acid residue, respectively, is absent.

"Derivative" refers to the chemical modification of a nucleic acid sequence, such as by
replacement of hydrogen by an alkyl, acyl, amino, hydroxyl, or other group.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other
chemical compound having a unique and defined position on a microarray.
"E-value" refers to the statistical probability that a match between two sequences occurred by
chance.

A "fragment" is a unique portion of mddt or MDDT which is identical in sequence to but 
shorter in length than the parent sequence. A fragment may comprise up to the entire length of the 
defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise 
from 10 to 1000 contiguous amino acid residues or nucleotides. A fragment used as a probe, primer,
antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 
60, 75, 100, 150, 250 or at least 500 contiguous amino acid residues or nucleotides in length. 
Fragments may be preferentially selected from certain regions of a molecule. For example, a 
polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 
250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. 
Clearly these lengths are exemplary, and any length that is supported by the specification, including the 
Sequence Listing and the figures, may be encompassed by the present embodiments. 
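
The length constraints in this definition can be expressed as a simple check. The following Python sketch is illustrative only; `is_fragment` and its `min_length` parameter are hypothetical names, and the check does not test whether the portion is unique within a genome.

```python
def is_fragment(candidate: str, parent: str, min_length: int = 1) -> bool:
    """Check the length and identity conditions for a fragment.

    The candidate must be a contiguous stretch identical to part of the
    parent, at least `min_length` residues long, and shorter than the
    parent (at most the full length minus one residue).
    """
    return min_length <= len(candidate) < len(parent) and candidate in parent
```

For example, with a hypothetical 12-residue parent, an 11-residue prefix qualifies as a fragment while the full 12-residue sequence does not.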

A fragment of mddt comprises a region of unique polynucleotide sequence that specifically 
identifies mddt, for example, as distinct from any other sequence in the same genome. A fragment of 
mddt is useful, for example, in hybridization and amplification technologies and in analogous methods 
that distinguish mddt from related polynucleotide sequences. The precise length of a fragment of mddt 
and the region of mddt to which the fragment corresponds are routinely determinable by one of ordinary 
skill in the art based on the intended purpose for the fragment. 

A fragment of MDDT is encoded by a fragment of mddt. A fragment of MDDT comprises a 
region of unique amino acid sequence that specifically identifies MDDT. For example, a fragment of 
MDDT is useful as an immunogenic peptide for the development of antibodies that specifically 
recognize MDDT. The precise length of a fragment of MDDT and the region of MDDT to which the 
fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended 
purpose for the fragment.

A "full length" nucleotide sequence is one containing at least a start site for translation to a 
protein sequence, followed by an open reading frame and a stop site, and encoding a "full length" 
polypeptide. 

"Hit" refers to a sequence whose annotation will be used to describe a given template. Criteria 
for selecting the top hit are as follows: if the template has one or more exact nucleic acid matches, the 
top hit is the exact match with highest percent identity. If the template has no exact matches but has 
significant protein hits, the top hit is the protein hit with the lowest E-value. If the template has no 
significant protein hits, but does have significant non-exact nucleotide hits, the top hit is the nucleotide 
hit with the lowest E-value. 
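
The three-tier selection rule for the top hit can be sketched in Python as follows. The `Hit` record and its field names are assumptions introduced for illustration; in particular, the specification does not state here how a hit is judged "significant", so that judgment is represented by a precomputed flag.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hit:
    accession: str           # hypothetical identifier for the matched record
    kind: str                # "nucleotide" or "protein"
    exact: bool              # True for an exact nucleic acid match
    percent_identity: float
    e_value: float
    significant: bool        # significance threshold is assumed to be applied upstream


def select_top_hit(hits: Sequence[Hit]) -> Optional[Hit]:
    """Apply the top-hit criteria in order of priority."""
    # 1) Exact nucleic acid matches: choose the highest percent identity.
    exact_nt = [h for h in hits if h.kind == "nucleotide" and h.exact]
    if exact_nt:
        return max(exact_nt, key=lambda h: h.percent_identity)
    # 2) Otherwise, significant protein hits: choose the lowest E-value.
    proteins = [h for h in hits if h.kind == "protein" and h.significant]
    if proteins:
        return min(proteins, key=lambda h: h.e_value)
    # 3) Otherwise, significant non-exact nucleotide hits: lowest E-value.
    non_exact = [
        h for h in hits
        if h.kind == "nucleotide" and not h.exact and h.significant
    ]
    if non_exact:
        return min(non_exact, key=lambda h: h.e_value)
    return None  # no hit qualifies under the stated criteria
```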
"Homology" refers to sequence similarity either between a reference nucleic acid sequence and
at least a fragment of an mddt or between a reference amino acid sequence and a fragment of an
MDDT.

"Hybridization" refers to the process by which a strand of nucleotides anneals with a
complementary strand through base pairing. Specific hybridization is an indication that two nucleic
acid sequences share a high degree of identity. Specific hybridization complexes form under defined
annealing conditions, and remain hybridized after the "washing" step. The defined hybridization
conditions include the annealing conditions and the washing step(s), the latter of which is particularly
important in determining the stringency of the hybridization process, with more stringent conditions
allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not
perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely
determinable and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency.

Generally, stringency of hybridization is expressed with reference to the temperature under
which the wash step is carried out. Generally, such wash temperatures are selected to be about 5°C to
20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and
pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target
sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for
nucleic acid hybridization are well known and can be found in Sambrook et al., 1989, Molecular
Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; specifically
see volume 2, chapter 9.
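
For orientation, one widely quoted approximation for DNA:DNA hybrids is Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 0.63(% formamide) - 600/N, where N is the hybrid length in nucleotides. Whether this is the exact equation intended by the cited chapter is an assumption, and the function names below are hypothetical. The sketch also derives the wash temperature window of 5°C to 20°C below Tm described above.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def melting_temp_c(seq: str, na_molar: float, formamide_percent: float = 0.0) -> float:
    """Approximate Tm in degrees C (assumed empirical formula, see lead-in)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(seq)
            - 0.63 * formamide_percent
            - 600.0 / len(seq))


def wash_window_c(tm: float) -> tuple:
    """Wash temperatures about 20 to 5 degrees C below Tm, as (low, high)."""
    return (tm - 20.0, tm - 5.0)
```

The approximation is empirical and is usually applied to hybrids of at least a few dozen nucleotides; short oligonucleotides are generally treated with other formulas.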

High stringency conditions for hybridization between polynucleotides of the present invention
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, or 55°C may be used. SSC concentration may be
varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking reagents
are used to block non-specific hybridization. Such blocking reagents include, for instance, denatured
salmon sperm DNA at about 100-200 ng/ml. Useful variations on these conditions will be readily
apparent to those skilled in the art. Hybridization, particularly under high stringency conditions, may
be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative
of a similar role for the nucleotides and their resultant proteins.

Other parameters, such as temperature, salt concentration, and detergent concentration, may be
varied to achieve the desired stringency. Denaturants, such as formamide at a concentration of about
35-50% v/v, may also be used under particular circumstances, such as RNA:DNA hybridizations.
Appropriate hybridization conditions are routinely determinable by one of ordinary skill in the art.

"Immunogenic" describes the potential for a natural, recombinant, or synthetic peptide, epitope,
polypeptide, or protein to induce antibody production in appropriate animals, cells, or cell lines.

"Insertion" or "addition" refers to a change in either a nucleic or amino acid sequence in which 
at least one nucleotide or residue, respectively, is added to the sequence. 

"Labeling" refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or
antibody with a reporter molecule capable of producing a detectable or measurable signal.

"Microarray" is any arrangement of nucleic acids, amino acids, antibodies, etc., on a substrate. 
The substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or an
appropriate membrane. 

"Linkers" are short stretches of nucleotide sequence which may be added to a vector or an mddt
to create restriction endonuclease sites to facilitate cloning. "Polylinkers" are engineered to incorporate
multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or 3' overhangs
(e.g., BamHI, EcoRI, and HindIII) and those which provide blunt ends (e.g., EcoRV, SnaBI, and StuI).

"Naturally occurring" refers to an endogenous polynucleotide or polypeptide that may be 
isolated from viruses or prokaryotic or eukaryotic cells.

"Nucleic acid sequence" refers to the specific order of nucleotides joined by phosphodiester 
bonds in a linear, polymeric arrangement. Depending on the number of nucleotides, the nucleic acid 
sequence can be considered an oligomer, oligonucleotide, or polynucleotide. The nucleic acid can be 
DNA, RNA, or any nucleic acid analog, such as PNA; may be of genomic or synthetic origin; may be
either double-stranded or single-stranded; and can represent either the sense or antisense
(complementary) strand.

"Oligomer" refers to a nucleic acid sequence of at least about 6 nucleotides and as many as 
about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20 and 
30 nucleotides, that may be used in hybridization or amplification technologies. Oligomers may be used
as, e.g., primers for PCR, and are usually chemically synthesized.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with the second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Generally, operably linked DNA sequences may be in close proximity or contiguous and,
where necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to a DNA mimic in which nucleotide bases are attached to 
a pseudopeptide backbone to increase stability. PNAs, also designated antigene agents, can prevent 
gene expression by targeting complementary messenger RNA.

The phrases "percent identity" and "% identity", as applied to polynucleotide sequences, refer
to the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in
the sequences being compared in order to optimize alignment between two sequences, and therefore
achieve a more meaningful comparison of the two sequences.
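
The idea of counting residue matches over an alignment can be sketched as follows. The function is hypothetical and assumes that the two sequences have already been aligned by some algorithm, with "-" marking inserted gaps. The choice of denominator (all alignment columns except those where both sequences have a gap) is also an assumption, since alignment programs differ on this point and may report different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both residues match.

    Inputs are two rows of a pairwise alignment with equal length.
    Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns
```

With the rows `ACGT-A` and `ACCTTA`, four of six columns match, giving about 66.7% identity.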

Percent identity between polynucleotide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence 
alignment program. This program is part of the LASERGENE software package, a suite of molecular 
biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in Higgins, D.G.
and Sharp, P.M. (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 8:189-191.
For pairwise alignments of polynucleotide sequences, the default parameters are set as follows:
Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue weight table is
selected as the default. Percent identity is reported by CLUSTAL V as the "percent similarity" between
aligned polynucleotide sequence pairs.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms is
provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search
Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several
sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis
programs including "blastn," that is used to determine alignment between a known polynucleotide
sequence and other sequences on a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2/. The
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.9 (May-07-1999) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example, as 
defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over 
the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at 
least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such
lengths are exemplary only, and it is understood that any fragment length supported by the sequences 
shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage 
identity may be measured. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in
nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences
that all encode substantially the same protein.
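
A small worked example may make the degeneracy point concrete. The sketch below builds the standard genetic code table and translates two made-up DNA sequences that agree at only two of nine positions yet encode the same tripeptide. The sequences and function names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, reading complete codons from position 0."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identical_positions(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a.upper(), b.upper()))


seq_1 = "CTTAGCCGT"  # Leu-Ser-Arg
seq_2 = "TTGTCAAGA"  # Leu-Ser-Arg using different codons
```

Here `translate(seq_1)` and `translate(seq_2)` both give `LSR`, while `identical_positions(seq_1, seq_2)` is only 2 of 9.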

The phrases "percent identity" and "% identity", as applied to polypeptide sequences, refer to
the percentage of residue matches between at least two polypeptide sequences aligned using a
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment
methods take into account conservative amino acid substitutions. Such conservative substitutions,
explained in more detail above, generally preserve the hydrophobicity and acidity of the substituted
residue, thus preserving the structure (and therefore function) of the folded polypeptide.

Percent identity between polypeptide sequences may be determined using the default parameters
of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment
program (described and referenced above). For pairwise alignments of polypeptide sequences using
CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and
"diagonals saved"=5. The PAM250 matrix is selected as the default residue weight table. As with
polynucleotide alignments, the percent identity is reported by CLUSTAL V as the "percent similarity"
between aligned polypeptide sequence pairs.

Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.9 
(May-07-1999) with blastp set at default parameters. Such default parameters may be, for example: 

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalty
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence, for
example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance,
a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150
contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a
length over which percentage identity may be measured.

"Post-translational modification" of an MDDT may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by
cell type depending on the enzymatic milieu and the MDDT.

"Probe" refers to mddt or fragments thereof, which are used to detect identical, allelic or related 
nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a detectable 
label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent 
agents, and enzymes. "Primers" are short nucleic acids, usually DNA oligonucleotides, which may be
annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended
along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification
(and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).

Probes and primers as used in the present invention typically comprise at least 15 contiguous 
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also 
be employed, such as probes and primers that comprise at least 20, 30, 40, 50, 60, 70, 80, 90, 100, or 
at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may 
be considerably longer than these examples, and it is understood that any length supported by the 
specification, including the figures and Sequence Listing, may be used. 

Methods for preparing and using probes and primers are described in the references, for
example Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY; Ausubel et al., 1987, Current Protocols in Molecular Biology,
Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis et al., 1990, PCR Protocols, A Guide
to Methods and Applications, Academic Press, San Diego CA. PCR primer pairs can be derived from
a known sequence, for example, by using computer programs intended for that purpose such as Primer
(Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).

Oligonucleotides for use as primers are selected using software known in the art for such 
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection
programs have incorporated additional features for expanded capabilities. For example, the PrimOU
primer selection program (available to the public from the Genome Center at University of Texas South
West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences
and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection
program (available to the public from the Whitehead Institute/MIT Center for Genome Research,
Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer
binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for
microarrays. (The source code for the latter two primer selection programs may also be obtained from
their respective sources and modified to meet the user's specific needs.) The PrimeGen program
(available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge
UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that
hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences.
Hence, this program is useful for identification of both unique and conserved oligonucleotides and
polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the
above selection methods are useful in hybridization technologies, for example, as PCR or sequencing
primers, microarray elements, or specific probes to identify fully or partially complementary
polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to
those described above.

"Purified" refers to molecules, either polynucleotides or polypeptides, that are isolated or
separated from their natural environment and are at least 60% free, preferably at least 75% free, and
most preferably at least 90% free from other compounds with which they are naturally associated.

A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence 
that is made by an artificial combination of two or more otherwise separated segments of sequence.
This artificial combination is often accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques
such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence.
Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

"Regulatory element" refers to a nucleic acid sequence from nontranslated regions of a gene,
and includes enhancers, promoters, introns, and 3' untranslated regions, which interact with host
proteins to carry out or regulate transcription or translation.

"Reporter" molecules are chemical or biochemical moieties used for labeling a nucleic acid, an 
amino acid, or an antibody. They include radionuclides; enzymes; fluorescent, chemiluminescent, or 
chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in
the art. 

An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear 
sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the 
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
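
At the level of a written sequence, this definition reduces to a single substitution, sketched below with a hypothetical helper. The change of sugar backbone has no representation in a plain sequence string and is therefore not modeled.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence: same order, T replaced by U."""
    return dna.upper().replace("T", "U")
```

For example, `rna_equivalent("ATGCTT")` gives `AUGCUU`.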

"Sample" is used in its broadest sense. Samples may contain nucleic or amino acids, 
antibodies, or other materials, and may be derived from any source (e.g., bodily fluids including, but not 
limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes isolated from a cell; 
genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared cells or tissues or blots 
or imprints from such cells or tissues).

"Specific binding" or "specifically binding" refers to the interaction between a protein or 
peptide and its agonist, antibody, antagonist, or other binding partner. The interaction is dependent 
upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, 
recognized by the binding molecule. For example, if an antibody is specific for epitope "A," the
presence of a polypeptide containing epitope A, or the presence of free unlabeled A, in a reaction
containing free labeled A and the antibody will reduce the amount of labeled A that binds to the
antibody.

"Substitution" refers to the replacement of at least one nucleotide or amino acid by a different 
nucleotide or amino acid. 

"Substrate" refers to any suitable rigid or semi-rigid support including, e.g., membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles or capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" refers to the collective pattern of gene expression by a particular tissue or
cell type under given conditions at a given time.

"Transformation" refers to a process by which exogenous DNA enters a recipient cell.
Transformation may occur under natural or artificial conditions using various methods well known in
the art. Transformation may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell being
transformed.

"Transformants" include stably transformed cells in which the inserted DNA is capable of
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as
cells which transiently express inserted DNA or RNA.

A "transgenic organism," as used herein, is any organism, including but not limited to animals
and plants, in which one or more of the cells of the organism contains heterologous nucleic acid
introduced by way of human intervention, such as by transgenic techniques well known in the art. The
nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell,
by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant
virus. The term genetic manipulation does not include classical cross-breeding, or in vitro fertilization,
but rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms
contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, and plants
and animals. The isolated DNA of the present invention can be introduced into the host by methods
known in the art, for example infection, transfection, transformation or transconjugation. Techniques
for transferring the DNA of the present invention into such organisms are widely known and provided in
references such as Sambrook et al. (1989), supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at 
least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of the
nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of nucleic acids may show, for example, at least 30%, at least
50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95% or even at least 98% or
greater sequence identity over a certain defined length. The variant may result in "conservative" amino
acid changes which do not affect structural and/or chemical properties. A variant may be described as,
for example, an "allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice
variant may have significant identity to a reference molecule, but will generally have a greater or lesser
number of polynucleotides due to alternate splicing of exons during mRNA processing. The
corresponding polypeptide may possess additional functional domains or lack domains that are present
in the reference molecule. Species variants are polynucleotide sequences that vary from one species to
another. The resulting polypeptides generally will have significant
amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide
sequence of a particular gene between individuals of a given species. Polymorphic variants also may
encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by
one base. The presence of SNPs may be indicative of, for example, a certain population, a disease
state, or a propensity for a disease state.
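
As a rough illustration of how single-base differences between two versions of a sequence might be listed, the sketch below compares two equal-length sequences position by position. The function name is hypothetical, the sequences are assumed to be already aligned without gaps, and nothing here establishes that a given difference is a true polymorphism rather than a sequencing error.

```python
def single_base_differences(reference: str, other: str) -> list:
    """List (1-based position, reference base, other base) for each mismatch."""
    if len(reference) != len(other):
        raise ValueError("sequences must be the same length (ungapped alignment)")
    return [(i + 1, r, o)
            for i, (r, o) in enumerate(zip(reference.upper(), other.upper()))
            if r != o]
```

For example, comparing `ACGTAC` with `ACGCAC` reports one difference, `(4, "T", "C")`.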

In an alternative, variants of the polynucleotides of the present invention may be generated
through recombinant methods. One possible method is a DNA shuffling technique such as
MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve
the biological properties of MDDT, such as its biological or enzymatic activity or its ability to bind to 
other molecules or compounds. DNA shuffling is a process by which a library of gene variants is 
produced using PCR-mediated recombination of gene fragments. The library is then subjected to 
selection or screening procedures that identify those gene variants with the desired properties. These 
preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and 
selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid molecular 
evolution. For example, fragments of a single gene containing random point mutations may be 
recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, 
fragments of a given gene may be recombined with fragments of homologous genes in the same gene 
family, either from the same or different species, thereby maximizing the genetic diversity of multiple 
naturally occurring genes in a directed and controllable manner. 

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having 
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of 
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 98% or greater sequence
identity over a certain defined length of one of the polypeptides.

THE INVENTION 

In a particular embodiment, cDNA sequences derived from human tissues and cell lines were 
aligned based on nucleotide sequence identity and assembled into "consensus" or "template" sequences 
which are designated by the template identification numbers (template IDs) in column 2 of Table 1.
The sequence identification numbers (SEQ ID NO:s) corresponding to the template IDs are shown in 
column 1. The template sequences have similarity to GenBank sequences, or "hits," as designated by 
the GI Numbers in column 3. The statistical probability of each GenBank hit is indicated by a 
probability score in column 4, and the functional annotation corresponding to each GenBank hit is listed 
in column 5. 

The invention incorporates the nucleic acid sequences of these templates as disclosed in the 
Sequence Listing and the use of these sequences in the diagnosis and treatment of disease states 
characterized by defects in disease detection and treatment molecules. The invention further utilizes
these sequences in hybridization and amplification technologies, and in particular, in technologies which
assess gene expression patterns correlated with specific cells or tissues and their responses in vivo or in
vitro to pharmaceutical agents, toxins, and other treatments. In this manner, the sequences of the
present invention are used to develop a transcript image for a particular cell or tissue.

Derivation of Nucleic Acid Sequences 

cDNA was isolated from libraries constructed using RNA derived from normal and diseased
human tissues and cell lines. The human tissues and cell lines used for cDNA library construction were
selected from a broad range of sources to provide a diverse population of cDNAs representative of gene
transcription throughout the human body. Descriptions of the human tissues and cell lines used for
cDNA library construction are provided in the LIFESEQ database (Incyte Genomics, Inc. (Incyte), Palo
Alto CA). Human tissues were broadly selected from, for example, cardiovascular, dermatologic,
endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural, reproductive, and
urologic sources.

Cell lines used for cDNA library construction were derived from, for example, leukemic cells,
teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells. Such
cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell lines
commonly used and available from public depositories (American Type Culture Collection, Manassas
VA). Prior to mRNA isolation, cell lines were untreated, treated with a pharmaceutical agent such as
5'-aza-2'-deoxycytidine, treated with an activating agent such as lipopolysaccharide in the case of
leukocytic cell lines, or, in the case of endothelial cell lines, subjected to shear stress.

Sequencing of the cDNAs 

Methods for DNA sequencing are well known in the art. Conventional enzymatic methods
employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA polymerase (U.S.
Biochemical Corporation, Cleveland OH), Taq polymerase (Applied Biosystems, Foster City CA),
thermostable T7 polymerase (Amersham Pharmacia Biotech, Inc. (Amersham Pharmacia Biotech),
Piscataway NJ), or combinations of polymerases and proofreading exonucleases such as those found in
the ELONGASE amplification system (Life Technologies Inc. (Life Technologies), Gaithersburg MD),
to extend the nucleic acid sequence from an oligonucleotide primer annealed to the DNA template of
interest. Methods have been developed for the use of both single-stranded and double-stranded
templates. Chain termination reaction products may be electrophoresed on urea-polyacrylamide gels
and detected either by autoradiography (for radioisotope-labeled nucleotides) or by fluorescence (for
fluorophore-labeled nucleotides). Automated methods for mechanized reaction preparation, sequencing,
and analysis using fluorescence detection methods have been developed. Machines used to prepare
cDNAs for sequencing can include the MICROLAB 2200 liquid transfer system (Hamilton Company
(Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research, Inc. (MJ Research), Watertown
MA), and ABI CATALYST 800 thermal cycler (Applied Biosystems). Sequencing can be carried out
using, for example, the ABI 373 or 377 (Applied Biosystems) or MEGABACE 1000 (Molecular
Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA) DNA sequencing systems, or other automated
and manual sequencing systems well known in the art.

The nucleotide sequences of the Sequence Listing have been prepared by current, state-of-the-art,
automated methods and, as such, may contain occasional sequencing errors or unidentified
nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified bases
do not represent a hindrance to practicing the invention for those skilled in the art. Several methods
employing standard recombinant techniques may be used to correct errors and complete the missing
sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short Protocols in
Molecular Biology, John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989) Molecular
Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview NY.)

Assembly of cDNA Sequences 

Human polynucleotide sequences may be assembled using programs or algorithms well known 
in the art. Sequences to be assembled are related, wholly or in part, and may be derived from a single 
or many different transcripts. Assembly of the sequences can be performed using such programs as 
PHRAP (Phil's Revised Assembly Program) and the GELVIEW fragment assembly system (GCG), or
other methods known in the art.

Alternatively, cDNA sequences are used as "component" sequences that are assembled into 
"template" or "consensus" sequences as follows. Sequence chromatograms are processed, verified, and 
quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway known 
as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte Genomics, Palo Alto, CA). A series 
of BLAST comparisons is performed and low-information segments and repetitive elements (e.g., 
dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious matches. 
Mitochondrial and ribosomal RNA sequences are also removed The processed sequences are then 
loaded into a relational database management system (RDMS) which assigns edited sequences to 
existing templates, if available. When additional sequences are added into the RDMS, a process is 
initiated which modifies existing templates or creates new templates from works in
progress (i.e., nonfinal assembled sequences) containing queued sequences or the sequences themselves.
After the new sequences have been assigned to templates, the templates can be merged into bins. If
multiple templates exist in one bin, the bin can be split and the templates reannotated.
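
By way of illustration only, the masking of repetitive elements described above may be sketched in Python as follows. The function name, the minimum repeat count, and the use of a regular expression are assumptions of this sketch and are not taken from the Block 1 editing pathway; Alu repeats and other low-information segments would require different handling.

```python
import re


def mask_dinucleotide_repeats(seq, min_units=6):
    """Replace runs of a repeated two-base unit with 'n' characters.

    A run is masked when the same two-base unit occurs at least
    `min_units` times in a row (an assumed cutoff). Because the unit may
    be two identical bases, long single-base runs are masked as well.
    """
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1), re.IGNORECASE)
    # Each masked base becomes one 'n', so sequence coordinates are preserved.
    return pattern.sub(lambda m: "n" * len(m.group(0)), seq)


masked = mask_dinucleotide_repeats("GGTCT" + "CA" * 6 + "TTGG")
```

Masking in place, rather than deleting the repeat, keeps the positions of the remaining bases unchanged for later alignment.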

Once gene bins have been generated based upon sequence alignments, bins are "clone joined"
based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in one
bin and the 3' sequence from the same clone is present in a different bin, indicating that the two bins 
should be merged into a single bin. Only bins which share at least two different clones are merged. 
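
The clone joining rule may be illustrated by the following minimal Python sketch. The data layout (a mapping from bin identifier to the set of clones having a 5' or 3' sequence in that bin), the function name, and the transitive merging of bins are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def clone_join(bin_clones, min_shared=2):
    """Group bins that share at least `min_shared` different clones.

    bin_clones maps a bin id to the set of clone ids with a 5' or 3'
    sequence in that bin. Returns a list of sets of bin ids.
    """
    parent = {b: b for b in bin_clones}

    def find(b):
        # Walk up to the group representative, shortening the path as we go.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for a, b in combinations(bin_clones, 2):
        if len(bin_clones[a] & bin_clones[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for b in bin_clones:
        groups[find(b)].add(b)
    return sorted(groups.values(), key=lambda g: sorted(g))


example_bins = {
    "bin1": {"c1", "c2", "c3"},
    "bin2": {"c2", "c3"},
    "bin3": {"c3", "c9"},
    "bin4": {"c7"},
}
joined = clone_join(example_bins)
```

In this example bin1 and bin2 share two clones and are joined, whereas bin3 shares only one clone with each of them and is left separate.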

A resultant template sequence may contain either a partial or a full length open reading frame,
or all or part of a genetic regulatory element. This variation is due in part to the fact that the full length
cDNAs of many genes are several hundred, and sometimes several thousand, bases in length. With
current technology, cDNAs comprising the coding regions of large genes cannot be cloned because of
vector limitations, incomplete reverse transcription of the mRNA, or incomplete "second strand"
synthesis. Template sequences may be extended to include additional contiguous sequences derived
from the parent RNA transcript using a variety of methods known to those of skill in the art. Extension
may thus be used to achieve the full length coding sequence of a gene.

Analysis of the cDNA Sequences 

The cDNA sequences are analyzed using a variety of programs and algorithms which are well 
known in the art. (See, e.g., Ausubel, 1997, supra, Chapter 7.7; Meyers, R.A. (Ed.) (1995) Molecular
Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853; and Table 7.) These analyses
comprise both reading frame determinations, e.g., based on triplet codon periodicity for particular 
organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and stop 
codons; and homology searches. 
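
As a simplified illustration of the analysis of potential start and stop codons, the following Python sketch scans the three forward reading frames of a sequence for ATG-to-stop open reading frames. It does not implement the codon periodicity method of Fickett; the function name and the minimum length parameter are assumptions of the sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (frame, start, end) for ATG...stop ORFs on the forward strand.

    start is the index of the A of ATG; end is one past the stop codon.
    min_codons counts the start and stop codons as part of the ORF.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


orfs = find_orfs("CCATGAAATTTTAGGG")
```

A fuller analysis would also scan the reverse complement and would report partial reading frames that run off either end of a template.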

Computer programs known to those of skill in the art for performing computer-assisted
searches for amino acid and nucleic acid sequence similarity include, for example, Basic Local
Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300; Altschul, S.F. et al.
(1990) J. Mol. Biol. 215:403-410). BLAST is especially useful in determining exact matches and
comparing two sequence fragments of arbitrary but equal lengths, whose alignment is locally maximal
and for which the alignment score meets or exceeds a threshold or cutoff score set by the user (Karlin,
S. et al. (1988) Proc. Natl. Acad. Sci. USA 85:841-845). Using an appropriate search tool (e.g.,
BLAST or HMM), GenBank, SwissProt, BLOCKS, PFAM and other databases may be searched for
sequences containing regions of homology to a query mddt or MDDT of the present invention.
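
The notion of a locally maximal pair of equal-length fragments whose score meets a cutoff may be illustrated by the following Python sketch, which exhaustively scores ungapped segments on every diagonal. It is not the BLAST program, which uses word seeding and statistical significance estimates; the scoring values, the cutoff, and the function name are assumptions.

```python
def best_segment_pair(a, b, match=5, mismatch=-4):
    """Highest-scoring ungapped segment pair between sequences a and b.

    Returns (score, start_in_a, start_in_b, length). Each diagonal is
    scanned once, restarting the running segment whenever its score
    falls to zero or below.
    """
    best = (0, 0, 0, 0)
    for offset in range(-(len(b) - 1), len(a)):
        i, j = max(offset, 0), max(-offset, 0)
        run_score, run_start, k = 0, 0, 0
        while i + k < len(a) and j + k < len(b):
            if run_score <= 0:
                run_score, run_start = 0, k
            run_score += match if a[i + k] == b[j + k] else mismatch
            if run_score > best[0]:
                best = (run_score, i + run_start, j + run_start, k - run_start + 1)
            k += 1
    return best


CUTOFF = 30  # assumed user-set threshold score
hit = best_segment_pair("TTTACGTACGTTT", "GGACGTACGGG")
reported = hit[0] >= CUTOFF
```

Here the shared fragment ACGTACG of length seven scores 35 and is reported because it meets the assumed cutoff of 30.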

Other approaches to the identification, assembly, storage, and display of nucleotide and 
polypeptide sequences are provided in "Relational Database for Storing Biomolecule Information," 
U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence
Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for
Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998,
all of which are incorporated by reference herein in their entirety.

Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif,
BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example, in
"Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data,"
U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by reference.

Human Disease Detection and Treatment Molecule Sequences

The mddt of the present invention may be used for a variety of diagnostic and therapeutic 
purposes. For example, an mddt may be used to diagnose a particular condition, disease, or disorder 
associated with disease detection and treatment molecules. Such conditions, diseases, and disorders 
include, but are not limited to, a cell proliferative disorder, such as actinic keratosis, arteriosclerosis,
atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis,
paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and
cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma,
teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain,
breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary,
pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and
uterus; and an autoimmune/inflammatory disorder, such as actinic keratosis, acquired
immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome,
allergies, ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma, atherosclerosis,
autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, bursitis, cholecystitis, cirrhosis,
contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus,
emphysema, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis,
Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, paroxysmal nocturnal
hemoglobinuria, hepatitis, hypereosinophilia, irritable bowel syndrome, episodic lymphopenia with
lymphocytotoxins, mixed connective tissue disease (MCTD), multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, myelofibrosis, osteoarthritis, osteoporosis, pancreatitis,
polycythemia vera, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma,
Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, primary
thrombocythemia, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome,
complications of cancer, hemodialysis, and extracorporeal circulation, trauma, and hematopoietic
cancer including lymphoma, leukemia, and myeloma. The mddt can be used to detect the presence of,
or to quantify the amount of, an mddt-related polynucleotide in a sample. This information is then
compared to information obtained from appropriate reference samples, and a diagnosis is established. 
Alternatively, a polynucleotide complementary to a given mddt can inhibit or inactivate a 
therapeutically relevant gene related to the mddt. 

Analysis of mddt Expression Patterns 

The expression of mddt may be routinely assessed by hybridization-based methods to 
determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity of 
mddt expression. For example, the level of expression of mddt may be compared among different cell 
types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at 
different developmental stages, or among cell types or tissues undergoing various treatments. This type 
of analysis is useful, for example, to assess the relative levels of mddt expression in fully or partially 
differentiated cells or tissues, to determine if changes in mddt expression levels are correlated with the 
development or progression of specific disease states, and to assess the response of a cell or tissue to a 
specific therapy, for example, in pharmacological or toxicological studies. Methods for the analysis of 
mddt expression are based on hybridization and amplification technologies and include membrane-based 
procedures such as northern blot analysis, high-throughput procedures that utilize, for example, 
microarrays, and PCR-based procedures. 

Hybridization and Genetic Analysis 

The mddt, their fragments, or complementary sequences, may be used to identify the presence 
of and/or to determine the degree of similarity between two (or more) nucleic acid sequences. The mddt 
may be hybridized to naturally occurring or recombinant nucleic acid sequences under appropriately 
selected temperatures and salt concentrations. Hybridization with a probe based on the nucleic acid 
sequence of at least one of the mddt allows for the detection of nucleic acid sequences, including 
genomic sequences, which are identical or related to the mddt of the Sequence Listing. Probes may be 
selected from non-conserved or unique regions of at least one of the polynucleotides of SEQ ID NO: 1 - 
45 and tested for their ability to identify or amplify the target nucleic acid sequence using standard 
protocols. 

Polynucleotide sequences that are capable of hybridizing, in particular, to those shown in SEQ 
ID NO: 1-45 and fragments thereof, can be identified using various conditions of stringency. (See, e.g., 
Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods
Enzymol. 152:507-511.) Hybridization conditions are discussed in "Definitions." 

A probe for use in Southern or northern hybridization may be derived from a fragment of an
mddt sequence, or its complement, that is up to several hundred nucleotides in length and is either
single-stranded or double-stranded. Such probes may be hybridized in solution to biological materials
such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared or sectioned tissues, or to
artificial substrates containing mddt. Microarrays are particularly suitable for identifying the presence
of and detecting the level of expression for multiple genes of interest by examining gene expression
correlated with, e.g., various stages of development, treatment with a drug or compound, or disease
progression. An array analogous to a dot or slot blot may be used to arrange and link polynucleotides
to the surface of a substrate using one or more of the following: mechanical (vacuum), chemical,
thermal, or UV bonding procedures. Such an array may contain any number of mddt and may be
produced by hand or by using available devices, materials, and machines.

Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., 
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci.
USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155;
and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.)

Probes may be labeled by either PCR or enzymatic techniques using a variety of commercially 
available reporter molecules. For example, commercial kits are available for radioactive and 
chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline phosphatase labeling (Life 
Technologies). Alternatively, mddt may be cloned into commercially available vectors for the
production of RNA probes. Such probes may be transcribed in the presence of at least one labeled
nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).

Additionally, the polynucleotides of SEQ ID NO: 1-45 or suitable fragments thereof can be used
to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures well
known in the art, e.g., cDNA library screening, PCR amplification, etc. The molecular cloning of such
full length cDNA sequences may employ the method of cDNA library screening with probes using the
hybridization, stringency, washing, and probing strategies described above and in Ausubel, supra,
Chapters 3, 5, and 6. These procedures may also be employed with genomic libraries to isolate
genomic sequences of mddt in order to analyze, e.g., regulatory elements.

Genetic Mapping

Gene identification and mapping are important in the investigation and treatment of almost all 
conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis, 
diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex than
the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being
predictive of predisposition for a particular condition, disease, or disorder. For example,
cardiovascular disease may result from malfunctioning receptor molecules that fail to clear cholesterol 
from the bloodstream, and diabetes may result when a particular individual's immune system is 
activated by an infection and attacks the insulin-producing cells of the pancreas. In some studies, 
Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a different gene
and location. Mapping of disease genes is a complex and reiterative process and generally proceeds 
from genetic linkage analysis to physical mapping. 

As a condition is noted among members of a family, a genetic linkage map traces parts of
chromosomes that are inherited in the same pattern as the condition. Statistics link the inheritance of
particular conditions to particular regions of chromosomes, as defined by RFLP or other markers.
(See, for example, Lander, E.S. and Botstein, D. (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)
Occasionally, genetic markers and their locations are known from previous studies. More often,
however, the markers are simply stretches of DNA that differ among individuals. Examples of genetic
linkage maps can be found in various scientific journals or at the Online Mendelian Inheritance in Man
(OMIM) World Wide Web site.
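
One statistic commonly used to link the inheritance of a condition to a marker is the two-point LOD score, which compares the likelihood of the observed recombinants at a recombination fraction theta with the likelihood under free recombination (theta = 0.5). The Python sketch below assumes phase-known, fully informative meioses; the function name and the example counts are illustrative only.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score: log10 of L(theta) / L(0.5)."""
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if theta == 0 and recombinants > 0:
        # Any observed recombinant rules out complete linkage.
        return float("-inf")
    nonrecombinants = meioses - recombinants
    log_linked = 0.0
    if recombinants:
        log_linked += recombinants * math.log10(theta)
    if nonrecombinants:
        log_linked += nonrecombinants * math.log10(1 - theta)
    return log_linked - meioses * math.log10(0.5)


# One recombinant observed in ten meioses, evaluated at theta = 0.1.
example_lod = lod_score(1, 10, 0.1)
```

By convention, a LOD score of 3 or more is usually taken as evidence of linkage.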

In another embodiment of the invention, mddt sequences may be used to generate hybridization 
probes useful in chromosomal mapping of naturally occurring genomic sequences. Either coding or 
noncoding sequences of mddt may be used, and in some instances, noncoding sequences may be 
preferable over coding sequences. For example, conservation of an mddt coding sequence among
members of a multi-gene family may potentially cause undesired cross hybridization during
chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region
of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes
(HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet.
7:149-154.)

Fluorescent in situ hybridization (FISH) may be correlated with other physical chromosome
mapping techniques and genetic map data. (See, e.g., Meyers, supra, pp. 965-968.) Correlation
between the location of mddt on a physical chromosomal map and a specific disorder, or a
predisposition to a specific disorder, may help define the region of DNA associated with that disorder.
The mddt sequences may also be used to detect polymorphisms that are genetically linked to the
inheritance of a particular condition, disease, or disorder.

In situ hybridization of chromosomal preparations and genetic mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending existing genetic
maps. Often the placement of a gene on the chromosome of another mammalian species, such as
mouse, may reveal associated markers even if the number or arm of the corresponding human
chromosome is not known. These new marker sequences can be mapped to human chromosomes and
may provide valuable information to investigators searching for disease genes using positional cloning
or other gene discovery techniques. Once a disease or syndrome has been crudely correlated by genetic
linkage with a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences
mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g.,
Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequences of the subject invention may
also be used to detect differences in chromosomal architecture due to translocation, inversion, etc.,
among normal, carrier, or affected individuals.

Once a disease-associated gene is mapped to a chromosomal region, the gene must be cloned in
order to identify mutations or other alterations (e.g., translocations or inversions) that may be correlated
with disease. This process requires a physical map of the chromosomal region containing the disease
gene of interest along with associated markers. A physical map is necessary for determining the
nucleotide sequence of and order of marker genes on a particular chromosomal region. Physical
mapping techniques are well known in the art and require the generation of overlapping sets of cloned
DNA fragments from a particular organelle, chromosome, or genome. These clones are analyzed to
reconstruct and catalog their order. Once the position of a marker is determined, the DNA from that
region is obtained by consulting the catalog and selecting clones from that region. The gene of interest
is located through positional cloning techniques using hybridization or similar methods.

Diagnostic Uses 

The mddt of the present invention may be used to design probes useful in diagnostic assays.
Such assays, well known to those skilled in the art, may be used to detect or confirm conditions,
disorders, or diseases associated with abnormal levels of mddt expression. Labeled probes developed
from mddt sequences are added to a sample under hybridizing conditions of desired stringency. In some
instances, mddt, or fragments or oligonucleotides derived from mddt, may be used as primers in
amplification steps prior to hybridization. The amount of hybridization complex formed is quantified
and compared with standards for that cell or tissue. If mddt expression varies significantly from the
standard, the assay indicates the presence of the condition, disorder, or disease. Qualitative or
quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based
technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent assay
(ELISA)-like, pin, or chip-based assays.
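
One simple way to decide whether a measured level "varies significantly from the standard" is to express it as a z-score against replicate standard measurements, as in the Python sketch below. The z-score approach, the cutoff of 2.0, the function name, and the example values are assumptions of this sketch; the choice of statistical test is left to the practitioner.

```python
import statistics


def deviates_from_standard(sample_level, standard_levels, z_cutoff=2.0):
    """Return (z, flagged) for a sample measured against standard replicates.

    z is the number of sample standard deviations separating the sample
    from the mean of the standards; flagged is True when |z| >= z_cutoff.
    """
    mean = statistics.mean(standard_levels)
    sd = statistics.stdev(standard_levels)
    z = (sample_level - mean) / sd
    return z, abs(z) >= z_cutoff


# Hypothetical hybridization signals for normal tissue and one patient sample.
standards = [10, 12, 11, 9, 8]
z, flagged = deviates_from_standard(15, standards)
```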

The probes described above may also be used to monitor the progress of conditions, disorders,
or diseases associated with abnormal levels of mddt expression, or to evaluate the efficacy of a
particular therapeutic treatment. The candidate probe may be identified from the mddt that are specific
to a given human tissue and have not been observed in GenBank or other genome databases. Such a
probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the treatment of
an individual patient. In a typical process, standard expression is established by methods well known in
the art for use as a basis of comparison, samples from patients affected by the disorder or disease are
combined with the probe to evaluate any deviation from the standard profile, and a therapeutic agent is
administered and effects are monitored to generate a treatment profile. Efficacy
is evaluated by determining whether the expression progresses toward or returns to the standard normal
pattern. Treatment profiles may be generated over a period of several days or several months.
Statistical methods well known to those skilled in the art may be used to determine the significance of
such therapeutic agents.

The polynucleotides are also useful for identifying individuals from minute biological samples,
for example, by matching the RFLP pattern of a sample's DNA to that of an individual's DNA. The
polynucleotides of the present invention can also be used to determine the actual base-by-base DNA
sequence of selected portions of an individual's genome. These sequences can be used to prepare PCR
primers for amplifying and isolating such selected DNA, which can then be sequenced. Using this
technique, an individual can be identified through a unique set of DNA sequences. Once a unique ID
database is established for an individual, positive identification of that individual can be made from
extremely small tissue samples.
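
The comparison of RFLP patterns may be illustrated in silico by the following Python sketch, which computes restriction fragment lengths for two sequences and compares them band for band. The use of the EcoRI recognition site (GAATTC, cut after the first G), the function names, the tolerance parameter, and the example sequences are assumptions made for illustration; in practice the pattern is read from a gel or blot rather than computed from a known sequence.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return the sorted fragment lengths produced by cutting at each site."""
    cuts, pos = [], seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))


def same_pattern(frags_a, frags_b, tolerance=0):
    """True if two sorted fragment-length patterns agree band for band."""
    return len(frags_a) == len(frags_b) and all(
        abs(x - y) <= tolerance for x, y in zip(frags_a, frags_b)
    )


# Hypothetical sequences: the second has lost one site through a single base change.
reference = "AAAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TT"
variant = "AAAA" + "GAATTC" + "CCCCC" + "GAGTTC" + "TT"
ref_frags = digest(reference)
var_frags = digest(variant)
```

The loss of a single restriction site changes both the number and the sizes of the fragments, which is what makes the pattern informative for identification.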

In a particular aspect, oligonucleotide primers derived from the mddt of the invention may be
used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and
deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP
detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and
fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from mddt are used to
amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example,
from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause
differences in the secondary and tertiary structures of PCR products in single-stranded form, and these
differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the
oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in
high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis
methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the
sequences of individual overlapping DNA fragments which assemble into a common consensus
sequence. These computer-based methods filter out sequence variations due to laboratory preparation
of DNA and sequencing errors using statistical models and automated analyses of DNA sequence
chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry
using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
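
The in silico comparison of overlapping fragments may be sketched in Python as follows. The sketch assumes that the reads have already been aligned to a common consensus, and it substitutes two simple filters (a minimum base quality and a minimum number of supporting reads) for the statistical models mentioned above; the function name and both thresholds are hypothetical.

```python
from collections import Counter


def candidate_snps(reads, quals, min_qual=20, min_support=2):
    """Find alignment columns where two or more alleles are well supported.

    reads: equal-length aligned strings ('-' marks a position not covered).
    quals: one list of per-base quality scores per read.
    Returns a list of (column, {base: number_of_supporting_reads}).
    """
    snps = []
    for col in range(len(reads[0])):
        counts = Counter(
            read[col]
            for read, q in zip(reads, quals)
            if read[col] != "-" and q[col] >= min_qual
        )
        # An allele seen in a single read is treated as a possible sequencing error.
        supported = {base: n for base, n in counts.items() if n >= min_support}
        if len(supported) >= 2:
            snps.append((col, supported))
    return snps


reads = ["ACGTA", "ACGTA", "ACTTA", "ACTTA", "AGGTA"]
quals = [[30] * 5 for _ in reads]
snps = candidate_snps(reads, quals)
```

In this example column 2 is reported because both G and T are each seen in at least two reads, whereas the lone G in column 1 of the last read is discarded.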

DNA-based identification techniques are critical in forensic technology. DNA sequences taken
from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva,
semen, etc., can be amplified using, e.g., PCR, to identify individuals. (See, e.g., Erlich, H. (1992)
PCR Technology, Freeman and Co., New York, NY). Similarly, polynucleotides of the present
invention can be used as polymorphic markers.

There is also a need for reagents capable of identifying the source of a particular tissue.
Appropriate reagents can comprise, for example, DNA probes or primers prepared from the sequences
of the present invention that are specific for particular tissues. Panels of such reagents can identify
tissue by species and/or by organ type. In a similar fashion, these reagents can be used to screen tissue
cultures for contamination.

The polynucleotides of the present invention can also be used as molecular weight markers on
nucleic acid gels or Southern blots, as diagnostic probes for the presence of a specific mRNA in a
particular cell type, in the creation of subtracted cDNA libraries which aid in the discovery of novel
polynucleotides, in selection and synthesis of oligomers for attachment to an array or other support, and
as an antigen to elicit an immune response.

Disease Model Systems Using mddt

The mddt of the invention or their mammalian homologs may be "knocked out" in an animal
model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well
known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S.
Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For example, mouse ES cells, such as
the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES
cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the
neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292). The vector
integrates into the corresponding region of the host genome by homologous recombination.
Alternatively, homologous recombination takes place using the Cre-loxP system to knockout a gene of
interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002;
Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are
identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain.
The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny
are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus
generated may be tested with potential therapeutic or toxic agents.

The mddt of the invention may also be manipulated in vitro in ES cells derived from human
blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages
including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for
example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. (1998) Science
282:1145-1147).

The mddt of the invention can also be used to create "knockin" humanized animals (pigs) or
transgenic animals (mice or rats) to model human disease. With knockin technology, a region of mddt
is injected into animal ES cells, and the injected sequence integrates into the animal cell genome.
Transformed cells are injected into blastulae, and the blastulae are implanted as described above.
Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to
obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress
mddt, resulting, e.g., in the secretion of MDDT in its milk, may also serve as a convenient source of
that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

Screening Assays 

MDDT encoded by polynucleotides of the present invention may be used to screen for
molecules that bind to or are bound by the encoded polypeptides. The binding of the polypeptide and
the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the
polypeptide or the bound molecule. Examples of such molecules include antibodies, oligonucleotides,
proteins (e.g., receptors), or small molecules.

Preferably, the molecule is closely related to the natural ligand of the polypeptide, e.g., a ligand
or fragment thereof, a natural substrate, or a structural or functional mimetic. (See, Coligan et al.,
(1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the molecule can be closely
related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor,
e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
Preferably, the screening for these molecules involves producing appropriate cells which express the
polypeptide, either as a secreted protein or on the cell membrane. Preferred cells include cells from
mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions
which contain the expressed polypeptide are then contacted with a test compound and binding,
stimulation, or inhibition of activity of either the polypeptide or the molecule is analyzed.

An assay may simply test binding of a candidate compound to the polypeptide, wherein binding 
is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. Alternatively, 
the assay may assess binding in the presence of a labeled competitor. 

Additionally, the assay can be carried out using cell-free preparations, polypeptide/molecule
affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply 
comprise the steps of mixing a candidate compound with a solution containing a polypeptide, measuring 
polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity or binding to 
a standard. 

Preferably, an ELISA assay using, e.g., a monoclonal or polyclonal antibody, can measure
polypeptide level in a sample. The antibody can measure polypeptide level by either binding, directly or 
indirectly, to the polypeptide or by competing with the polypeptide for a substrate. 

All of the above assays can be used in a diagnostic or prognostic context. The molecules 
discovered using these assays can be used to treat disease or to bring about a particular result in a 
patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule. Moreover, the 
assays can discover agents which may inhibit or enhance the production of the polypeptide from 
suitably manipulated cells or tissues. 

Transcript Imaging and Toxicological Testing 

Another embodiment relates to the use of mddt to develop a transcript image of a tissue or cell 
type. A transcript image represents the global pattern of gene expression by a particular tissue or cell
type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and
their relative abundance under given conditions and at a given time. (See Seilhamer et al.,
"Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, expressly incorporated by 
reference herein.) Thus a transcript image may be generated by hybridizing the polynucleotides of the 
present invention or their complements to the totality of transcripts or reverse transcripts of a particular 
tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, 
wherein the polynucleotides of the present invention or their complements comprise a subset of a 
plurality of elements on a microarray. The resultant transcript image would provide a profile of gene 
activity pertaining to disease detection and treatment molecules. 

Transcript images which profile mddt expression may be generated using transcripts isolated 
from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect 
mddt expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell
line. 

Transcript images which profile mddt expression may also be used in conjunction with in vitro 
model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of 
industrial and naturally-occurring environmental compounds. All compounds induce characteristic 
gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are 
indicative of mechanisms of action and toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159;
Steiner, S. and Anderson, N.L. (2000) Toxicol. Lett. 112-113:467-71, expressly incorporated by
reference herein). If a test compound has a signature similar to that of a compound with known 
toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and 
refined when they contain expression information from a large number of genes and gene families. 
Ideally, a genome-wide measurement of expression provides the highest quality signature. Genes
whose expression is not altered by any tested compound are important as well, because the levels of
expression of these genes are used to normalize the rest of the expression data. The normalization
procedure is useful for comparison of expression data after treatment with different compounds. While 
the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity 
mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures 
which leads to prediction of toxicity. (See, for example, Press Release 00-02 from the National 
Institute of Environmental Health Sciences, released February 29, 2000, available at 
http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in 
toxicological screening using toxicant signatures to include all expressed gene sequences. 
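
By way of illustration only, the normalization and statistical matching of signatures described above may be sketched as follows. The gene names, expression values, use of log2 ratios, and choice of Pearson correlation as the matching statistic are assumptions made for this example; they are not taken from the cited references.

```python
import math

def normalize(profile, reference_genes):
    """Scale every expression level by the mean level of the reference genes."""
    ref_mean = sum(profile[g] for g in reference_genes) / len(reference_genes)
    return {gene: level / ref_mean for gene, level in profile.items()}

def signature(treated, untreated, reference_genes):
    """Per-gene log2(treated / untreated), computed after normalization."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {gene: math.log2(t[gene] / u[gene]) for gene in t}

def pearson(sig_a, sig_b):
    """Pearson correlation over the genes shared by two signatures."""
    genes = sorted(set(sig_a) & set(sig_b))
    xs = [sig_a[g] for g in genes]
    ys = [sig_b[g] for g in genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression levels; ref1 and ref2 stand for genes whose
# expression is not altered by any tested compound.
REFERENCE = ["ref1", "ref2"]
untreated = {"geneA": 100, "geneB": 200, "geneC": 50, "ref1": 80, "ref2": 120}
test_compound = {"geneA": 400, "geneB": 100, "geneC": 55, "ref1": 160, "ref2": 240}
known_toxicant = {"geneA": 210, "geneB": 45, "geneC": 50, "ref1": 80, "ref2": 120}
known_benign = {"geneA": 95, "geneB": 210, "geneC": 100, "ref1": 80, "ref2": 120}

test_sig = signature(test_compound, untreated, REFERENCE)
toxic_sig = signature(known_toxicant, untreated, REFERENCE)
benign_sig = signature(known_benign, untreated, REFERENCE)

r_toxic = pearson(test_sig, toxic_sig)
r_benign = pearson(test_sig, benign_sig)
print(f"similarity to known toxicant: {r_toxic:.2f}")
print(f"similarity to benign compound: {r_benign:.2f}")
```

In this sketch the test sample carries twice the overall signal of the untreated sample, and normalization against the unaltered genes removes that difference before the signatures are compared. No gene function is used at any step.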

In one embodiment, the toxicity of a test compound is assessed by treating a biological sample 
containing nucleic acids with the test compound. Nucleic acids that are expressed in the 
treated biological sample are hybridized with one or more probes specific to the polynucleotides of 
the present invention, so that transcript levels corresponding to the polynucleotides of the present 
invention may be quantified. The transcript levels in the treated biological sample are compared with 
levels in an untreated biological sample. Differences in the transcript levels between the two samples 
are indicative of a toxic response caused by the test compound in the treated sample. 
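
The comparison of transcript levels in this embodiment may be sketched, purely as an example, as a fold-change test. The two-fold threshold, the function name, and the transcript labels below are hypothetical and are not specified by the present description.

```python
def differential_transcripts(treated, untreated, min_fold=2.0):
    """Return transcripts whose treated/untreated ratio differs by at least min_fold.

    Transcripts absent from the untreated sample, or with a non-positive
    level in either sample, are skipped because no ratio can be formed.
    """
    flagged = {}
    for transcript, treated_level in treated.items():
        untreated_level = untreated.get(transcript)
        if untreated_level is None or untreated_level <= 0 or treated_level <= 0:
            continue
        fold = treated_level / untreated_level
        if fold >= min_fold or fold <= 1.0 / min_fold:
            flagged[transcript] = fold
    return flagged

# Hypothetical hybridization signals for three probes.
treated = {"mddt_1": 50, "mddt_2": 10, "mddt_3": 11}
untreated = {"mddt_1": 10, "mddt_2": 10, "mddt_3": 44}
flagged = differential_transcripts(treated, untreated)
print(flagged)
```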

Another particular embodiment relates to the use of MDDT encoded by polynucleotides of the 
present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the 
global pattern of protein expression in a particular tissue or cell type. Each protein component of a 
proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, 
are analyzed by quantifying the number of expressed proteins and their relative abundance under given 
conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and 
analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is 
achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by 
isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl 
sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra) . The proteins are 
visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an 
agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is 
generally proportional to the level of the protein in the sample. The optical densities of equivalently 
positioned protein spots from different samples, for example, from biological samples either treated or 
untreated with a test compound or therapeutic agent, are compared to identify any changes in protein 
spot density related to the treatment. The proteins in the spots are partially sequenced using, for 
example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. 
The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of 
at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In 
some cases, further sequence data may be obtained for definitive protein identification. 
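
As a minimal sketch of the identification step described above, a partial sequence of at least 5 contiguous residues may be searched for within a set of polypeptide sequences. The sequences and names below are invented placeholders, not sequences of the Sequence Listing, and exact substring matching is an assumption chosen for simplicity.

```python
def identify_spot(partial_sequence, polypeptides, min_length=5):
    """Return names of polypeptides containing the partial sequence exactly."""
    if len(partial_sequence) < min_length:
        raise ValueError(
            f"partial sequence must have at least {min_length} contiguous residues"
        )
    return [
        name for name, sequence in polypeptides.items()
        if partial_sequence in sequence
    ]

# Placeholder sequences for illustration only.
polypeptides = {
    "MDDT-1": "MKTAYIAKQRQISFVKSHFSRQ",
    "MDDT-2": "MSLLTEVETPIRNEWGCR",
}
matches = identify_spot("AKQRQ", polypeptides)
print(matches)
```

A match to more than one polypeptide would correspond to the case noted above in which further sequence data is needed for definitive identification.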

A proteomic profile may also be generated using antibodies specific for MDDT to quantify the 
levels of MDDT expression. In one embodiment, the antibodies are used as elements on a microarray, 
and protein expression levels are quantified by exposing the microarray to the sample and detecting the 
levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; 
Mendoza, L. G. et al. (1999) Biotechniques 27:778-88). Detection may be performed by a variety of 
methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino- 
reactive fluorescent compound and detecting the amount of fluorescence bound at each array element. 

Toxicant signatures at the proteome level are also useful for toxicological screening, and should 
be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation 
between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and 
Seilhamer, J. (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the 
analysis of compounds which do not significantly affect the transcript image, but which alter the 
proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid 
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins that are expressed in the treated biological 
sample are separated so that the amount of each protein can be quantified. The amount of each protein 
is compared to the amount of the corresponding protein in an untreated biological sample. A difference 
in the amount of protein between the two samples is indicative of a toxic response to the test compound 
in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the 
individual proteins and comparing these partial sequences to the MDDT encoded by polynucleotides of 
the present invention. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins from the biological sample are incubated 
with antibodies specific to the MDDT encoded by polynucleotides of the present invention. The amount 
of protein recognized by the antibodies is quantified. The amount of protein in the treated biological 
sample is compared with the amount in an untreated biological sample. A difference in the amount of 
protein between the two samples is indicative of a toxic response to the test compound in the treated 
sample. 

Transcript images may be used to profile mddt expression in distinct tissue types. This process 
can be used to determine disease detection and treatment molecule activity in a particular tissue type 
relative to this activity in a different tissue type. Transcript images may be used to generate a profile of 
mddt expression characteristic of diseased tissue. Transcript images of tissues before and after 
treatment may be used for diagnostic purposes, to monitor the progression of disease, and to monitor 
the efficacy of drug treatments for diseases which affect the activity of disease detection and treatment 
molecules. 

Transcript images of cell lines can be used to assess disease detection and treatment molecule 
activity and/or to identify cell lines that lack or misregulate this activity. Such cell lines may then be 
treated with pharmaceutical agents, and a transcript image following treatment may indicate the 
efficacy of these agents in restoring desired levels of this activity. A similar approach may be used to 
assess the toxicity of pharmaceutical agents as reflected by undesirable changes in disease detection and 
treatment molecule activity. Candidate pharmaceutical agents may be evaluated by comparing their 
associated transcript images with those of pharmaceutical agents of known effectiveness. 



Antisense Molecules 

The polynucleotides of the present invention are useful in antisense technology. Antisense 
technology or therapy relies on the modulation of expression of a target protein through the specific 
binding of an antisense sequence to a target sequence encoding the target protein or directing its 
expression. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa 
NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol. 
40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063; and Lavrosky, Y. et 
al. (1997) Biochem. Mol. Med. 62(1):11-22.) An antisense sequence is a polynucleotide sequence 
capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences 
bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense 
sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991) 
Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge, 
W.M. et al. (1995) Proc. Natl. Acad. Sci. USA 92(12):5592-5596; and Nielsen, P. E. and Haaima, G. 
(1997) Chem. Soc. Rev. 96:73-78.) Typically, the binding which results in modulation of expression 
occurs through hybridization or binding of complementary base pairs. Antisense sequences can also 
bind to DNA duplexes through specific interactions in the major groove of the double helix. 

The polynucleotides of the present invention and fragments thereof can be used as antisense 
sequences to modify the expression of the polypeptide encoded by mddt. The antisense sequences can 
be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (Applied 
Biosystems) or other automated systems known in the art. Antisense sequences can also be produced 
biologically, such as by transforming an appropriate host cell with an expression vector containing the 
sequence of interest. (See, e.g., Agrawal, supra.) 
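
For illustration, an antisense DNA sequence complementary to a portion of a target sequence may be derived as the reverse complement of that portion. The function name, coordinates, and target sequence below are hypothetical; the sketch handles unmodified DNA bases only and does not address the mimics and analogs mentioned above.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antisense(target, start, length):
    """Return the reverse complement of target[start:start + length]."""
    fragment = target[start:start + length]
    if len(fragment) < length:
        raise ValueError("requested region extends past the end of the target")
    return fragment.translate(COMPLEMENT)[::-1]

# Hypothetical target fragment, written 5' to 3'.
target_sequence = "ATGGCCAAGT"
oligo = antisense(target_sequence, 0, 6)
print(oligo)
```

Both the input and the returned sequence are written 5' to 3', so the returned oligonucleotide pairs base for base with the selected region of the target.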

In therapeutic use, any gene delivery system suitable for introduction of the antisense sequences 
into appropriate target cells can be used. Antisense sequences can be delivered intracellularly in the 
form of an expression plasmid which, upon transcription, produces a sequence complementary to at 
least a portion of the cellular sequence encoding the target protein. (See, e.g., Slater, J.E., et al. (1998) 
J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J., et al. (1995) 9(13):1288-1296.) 
Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as 
retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood 76:271; Ausubel, 
F.M. et al. (1995) Current Protocols in Molecular Biology , John Wiley & Sons, New York NY; Uckert, 
W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other gene delivery mechanisms include 
liposome-derived systems, artificial viral envelopes, and other systems known in the art. (See, e.g., 
Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et al. (1998) J. Pharm. Sci. 87(11):1308- 
1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 25(14):2730-2736.) 

Expression 

In order to express a biologically active MDDT, the nucleotide sequences encoding MDDT or 
fragments thereof may be inserted into an appropriate expression vector, i.e., a vector which contains 
the necessary elements for transcriptional and translational control of the inserted coding sequence in a 
suitable host. Methods which are well known to those skilled in the art may be used to construct 
expression vectors containing sequences encoding MDDT and appropriate transcriptional and 
translational control elements. These methods include in vitro recombinant DNA techniques, synthetic 
techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra, Chapters 4, 8, 16, and 17; 
and Ausubel, supra, Chapters 9, 10, 13, and 16.) 

A variety of expression vector/host systems may be utilized to contain and express sequences 
encoding MDDT. These include, but are not limited to, microorganisms such as bacteria transformed 
with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with 
yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); 
plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or 
tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or 
animal (mammalian) cell systems. (See, e.g., Sambrook, supra; Ausubel, 1995, supra; Van Heeke, G. 
and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Bitter, G.A. et al. (1987) Methods Enzymol. 
153:516-544; Scorer, C.A. et al. (1994) Bio/Technology 12:181-184; Engelhard, E.K. et al. (1994) 
Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; 
Takamatsu, N. (1987) EMBO J. 6:307-311; Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, 
R. et al. (1984) Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105; 
The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp. 
191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington, 
J. J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses, adenoviruses, 
or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide 
sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola, M. et al. (1998) 
Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90(13):6340-6344; 
Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al. (1994) Mol. Immunol. 
31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.) The invention is not 
limited by the host cell employed. 

For long term production of recombinant proteins in mammalian systems, stable expression of 
MDDT in cell lines is preferred. For example, sequences encoding MDDT can be transformed into cell 
lines using expression vectors which may contain viral origins of replication and/or endogenous 
expression elements and a selectable marker gene on the same or on a separate vector. Any number of 
selection systems may be used to recover transformed cell lines. (See, e.g., Wigler, M. et al. (1977) 
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823; Wigler, M. et al. (1980) Proc. Natl. Acad. 
Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14; Hartman, S.C. and 
R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051; Rhodes, C.A. (1995) Methods Mol. 
Biol. 55:121-131.) 

Therapeutic Uses of mddt 

The mddt of the invention may be used for somatic or germline gene therapy. Gene therapy 
may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined 
immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et 
al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an 
inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480; 
Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207- 
216; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene 
Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor 
VIII or Factor IX deficiencies (Crystal, R.G. (1995) Science 270:404-410; Verma, I.M. and Somia, N. 
(1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of 
cancers which result from unregulated cell proliferation), or (iii) express a protein which affords 
protection against intracellular parasites (e.g., against human retroviruses, such as human 
immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) 
Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, 
such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as 
Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in mddt 
expression or regulation causes disease, the expression of mddt from an appropriate population of 
transduced cells may alleviate the clinical manifestations caused by the genetic deficiency. 

In a further embodiment of the invention, diseases or disorders caused by deficiencies in mddt 
are treated by constructing mammalian expression vectors comprising mddt and introducing these 
vectors by mechanical means into mddt-deficient cells. Mechanical transfer technologies for use with 
cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold 
particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the 
use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev. Biochem. 62:191-217; 
Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Récipon, H. (1998) Curr. Opin. Biotechnol. 9:445- 
450). 

Expression vectors that may be effective for the expression of mddt include, but are not limited 
to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA), 
PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF, 
PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). The mddt of the invention 
may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous 
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter 
(e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992) Proc. Natl. Acad. Sci. 
U.S.A. 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and Blau, 
H.M. (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid 
(Invitrogen); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; 
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter 
(Rossi, F.M.V. and Blau, H.M., supra)), or (iii) a tissue-specific promoter or the native promoter of the 
endogenous gene encoding MDDT from a normal individual. 

Commercially available liposome transformation kits (e.g., the PERFECT LIPID 
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver 
polynucleotides to target cells in culture and require minimal effort to optimize experimental 
parameters. In the alternative, transformation is performed using the calcium phosphate method 
(Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. 
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these 
standardized mammalian transfection protocols. 

In another embodiment of the invention, diseases or disorders caused by genetic defects with 
respect to mddt expression are treated by constructing a retrovirus vector consisting of (i) mddt under 
the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) 
appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional 
retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation. 
Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on 
published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A. 92:6733-6737), incorporated by 
reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that 
expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope 
protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) 
J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J. Virol. 62:3802-3806; Dull, T. et al. 
(1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Patent Number 
5,910,434 to Rigg ("Method for obtaining retrovirus packaging cell lines producing high transducing 
efficiency retroviral supernatant") discloses a method for obtaining retrovirus packaging cell lines and 
is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of a population of 
cells (e.g., CD4+ T-cells), and the return of transduced cells to a patient are procedures well known to 
persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. 
Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 
71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. U.S.A. 95:1201-1206; Su, L. (1997) 
Blood 89:2283-2290). 

In the alternative, an adenovirus-based gene therapy delivery system is used to deliver mddt to 
cells which have one or more genetic abnormalities with respect to the expression of mddt. The 
construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in 
the art. Replication defective adenovirus vectors have proven to be versatile for importing genes 
encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995) 
Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Patent 
Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by 
reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544 
and Verma, I.M. and Somia, N. (1997) Nature 389:239-242, both incorporated by reference herein. 



WO 01/62922 



PCT/US01/05896 



In another alternative, a herpes-based gene therapy delivery system is used to deliver mddt to 
target cells which have one or more genetic abnormalities with respect to the expression of mddt. The 
use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing mddt to 
cells of the central nervous system, for which HSV has a tropism. The construction and packaging of 
herpes-based vectors are well known to those with ordinary skill in the art. A replication-competent 
herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of 
primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The construction of a HSV-1 virus vector 
has also been disclosed in detail in U.S. Patent Number 5,804,413 to DeLuca ("Herpes simplex virus 
strains for gene transfer"), which is hereby incorporated by reference. U.S. Patent Number 5,804,413 
teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous 
gene to be transferred to a cell under the control of the appropriate promoter for purposes including 
human gene therapy. Also taught by this patent are the construction and use of recombinant HSV 
strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999) J. 
Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161, hereby incorporated by reference. 
The manipulation of cloned herpesvirus sequences, the generation of recombinant virus following the 
transfection of multiple plasmids containing different segments of the large herpesvirus genomes, the 
growth and propagation of herpesvirus, and the infection of cells with herpesvirus are techniques well 
known to those of ordinary skill in the art. 

In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to 
deliver mddt to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has 
been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and 
Li, K-J. (1998) Curr. Opin. Biotech. 9:464-469). During alphavirus RNA replication, a subgenomic 
RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to 
higher levels than the full-length genomic RNA, resulting in the overproduction of capsid proteins 
relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, 
inserting mddt into the alphavirus genome in place of the capsid-coding region results in the production 
of a large number of mddt RNAs and the synthesis of high levels of MDDT in vector 
transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the 
ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of 
Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of 
the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of 
alphaviruses will allow the introduction of mddt into a variety of cell types. The specific transduction 
of a subset of cells in a population may require the sorting of cells prior to transduction. The methods 
of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA 
transfections, and performing alphavirus infections, are well known to those with ordinary skill in the 
art. 

Antibodies 

Anti-MDDT antibodies may be used to analyze protein expression levels. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments. For 
descriptions of and protocols of antibody technologies, see, e.g., Pound, J.D. (1998) Immunochemical 
Protocols, Humana Press, Totowa, NJ. 

The amino acid sequence encoded by the mddt of the Sequence Listing may be analyzed by 
appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to determine regions 
of high immunogenicity. The optimal sequences for immunization are selected from the C-terminus, the 
N-terminus, and those intervening, hydrophilic regions of the polypeptide which are likely to be exposed 
to the external environment when the polypeptide is in its natural conformation. Analysis used to select 
appropriate epitopes is also described by Ausubel (1997, supra, Chapter 11.7). Peptides used for 
antibody induction do not need to have biological activity; however, they must be antigenic. Peptides 
used to induce specific antibodies may have an amino acid sequence consisting of at least five amino 
acids, preferably at least 10 amino acids, and most preferably at least 15 amino acids. A peptide which 
mimics an antigenic fragment of the natural polypeptide may be fused with another protein such as 
keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody production. A peptide 
encompassing an antigenic region may be expressed from an mddt, synthesized as described above, or 
purified from human cells. 
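
The selection of a hydrophilic region for immunization may be illustrated by a simple sliding-window scan. This sketch is not the algorithm of the LASERGENE software named above; it substitutes the published Kyte-Doolittle hydropathy scale, in which lower values indicate more hydrophilic residues. The function name, window length, and input sequence are assumptions for the example.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}

def most_hydrophilic_window(sequence, window=15):
    """Return (start index, peptide) of the window with the lowest summed hydropathy."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the requested window")

    def window_score(start):
        return sum(KYTE_DOOLITTLE[aa] for aa in sequence[start:start + window])

    best_start = min(range(len(sequence) - window + 1), key=window_score)
    return best_start, sequence[best_start:best_start + window]

# Invented sequence with a charged stretch flanked by hydrophobic residues.
example = "LLLLLVVVVVKKDDEERRKKLLLLL"
start, peptide = most_hydrophilic_window(example, window=5)
print(start, peptide)
```

A window of 5 is used here only to keep the example short; a window of 15 would correspond to the most preferred peptide length stated above.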

Procedures well known in the art may be used for the production of antibodies. Various hosts 
including mice, goats, and rabbits, may be immunized by injection with a peptide. Depending on the 
host species, various adjuvants may be used to increase immunological response. 

In one procedure, peptides about 15 residues in length may be synthesized using an ABI 431A 
peptide synthesizer (Applied Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by 
reaction with m-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra). Rabbits are 
immunized with the peptide-KLH complex in complete Freund's adjuvant. The resulting antisera are 
tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum albumin 
(BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG. 
Antisera with antipeptide activity are tested for anti-MDDT activity using protocols well known in the 
art, including ELISA, radioimmunoassay (RIA), and immunoblotting. 

In another procedure, isolated and purified peptide may be used to immunize mice (about 100 
μg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and used 
to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies. Positive 
cells are then used to produce hybridomas using standard techniques. About 20 mg of peptide is 
sufficient for labeling and screening several thousand clones. Hybridomas of interest are detected by 
screening with radioiodinated peptide to identify those fusions producing peptide-specific monoclonal 
antibody. In a typical protocol, wells of a multi-well plate (FAST, Becton-Dickinson, Palo Alto, CA) 
are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species IgG) antibodies at 
10 mg/ml. The coated wells are blocked with 1% BSA and washed and exposed to supernatants from 
hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1 mg/ml. 

Clones producing antibodies bind a quantity of labeled peptide that is detectable above 
background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are 
injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the 
ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several 
procedures for the production of monoclonal antibodies, including in vitro production, are described in 
Pound (supra). Monoclonal antibodies with antipeptide activity are tested for anti-MDDT activity 
using protocols well known in the art, including ELISA, RIA, and immunoblotting. 

Antibody fragments containing specific binding sites for an epitope may also be generated. For 
example, such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin 
digestion of the antibody molecule, and the Fab fragments generated by reducing the disulfide bridges of 
the F(ab')2 fragments. Alternatively, construction of Fab expression libraries in filamentous 
bacteriophage allows rapid and easy identification of monoclonal fragments with desired specificity 
(Pound, supra, Chaps. 45-47). Antibodies generated against polypeptide encoded by mddt can be used 
to purify and characterize full-length MDDT protein and its activity, binding partners, etc. 

Assays Using Antibodies 

Anti-MDDT antibodies may be used in assays to quantify the amount of MDDT found in a 
particular human cell. Such assays include methods utilizing the antibody and a label to detect 
expression level under normal or disease conditions. The peptides and antibodies of the invention may 
be used with or without modification or labeled by joining them, either covalently or noncovalently, 
with a reporter molecule. 

Protocols for detecting and measuring protein expression using either polyclonal or monoclonal 
antibodies are well known in the art. Examples include ELISA, RIA, and fluorescence-activated cell 
sorting (FACS). Such immunoassays typically involve the formation of complexes between the MDDT 
and its specific antibody and the measurement of such complexes. These and other assays are described 
in Pound (supra). 

Without further elaboration, it is believed that one skilled in the art can, using the preceding 
description, utilize the present invention to its fullest extent. The following preferred specific 
embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of 
the disclosure in any way whatsoever. 

The disclosures of all patents, applications, and publications mentioned above and below, in 
particular U.S. Ser. No. 60/185,213, U.S. Ser. No. 60/205,285, U.S. Ser. No. 60/205,232, U.S. Ser. 
No. 60/205,323, U.S. Ser. No. 60/205,287, U.S. Ser. No. 60/205,324, and U.S. Ser. No. 60/205,286, 
are hereby expressly incorporated by reference. 

EXAMPLES 

I. Construction of cDNA Libraries 

RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated from 
various tissues. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others 
were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life
Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates
were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated with either
isopropanol or sodium acetate and ethanol, or by other routine methods. 

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA 
purity. In most cases, RNA was treated with DNase. For most libraries, poly(A+) RNA was isolated 
using oligo d(T)-coupled paramagnetic particles (Promega Corporation (Promega), Madison WI), 
OLIGOTEX latex particles (QIAGEN, Inc. (QIAGEN), Valencia CA), or an OLIGOTEX mRNA
purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other 
RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Inc., Austin TX). 

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA 
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP 
vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT 
plasmid system (Life Technologies), using the recommended procedures or similar methods known in 
the art. (See, e.g., Ausubel, 1997, supra, Chapters 5.1 through 6.6.) Reverse transcription was
initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to
double-stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For
most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE
CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or
preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of
the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid
(Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene),
or pINCY (Incyte Genomics, Palo Alto CA), or derivatives thereof. Recombinant plasmids were
transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from
Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.


II. Isolation of cDNA Clones 

Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system 
(Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or 
WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge
BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and QIAWELL 8 Ultra
plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit (QIAGEN). Following
precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without
lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using
PICOGREEN dye (Molecular Probes, Inc. (Molecular Probes), Eugene OR) and a FLUOROSKAN II
fluorescence scanner (Labsystems Oy, Helsinki, Finland). 


III. Sequencing and Analysis 

cDNA sequencing reactions were processed using standard methods or high-throughput
instrumentation such as the ABI CATALYST 800 thermal cycler (Applied Biosystems) or the
PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific
Corp., Sunnyvale CA) or the MICROLAB 2200 liquid transfer system (Hamilton). cDNA sequencing
reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI
sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit
(Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of
labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system
(Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in
conjunction with standard ABI protocols and base calling software; or other sequence analysis systems
known in the art. Reading frames within the cDNA sequences were identified using standard methods
(reviewed in Ausubel, 1997, supra, Chapter 7.7). Some of the cDNA sequences were selected for
extension using the techniques disclosed in Example VIII. 

IV. Assembly and Analysis of Sequences

Component sequences from chromatograms were subject to PHRED analysis and assigned a
quality score. The sequences having at least a required quality score were subject to various
pre-processing editing pathways to eliminate, e.g., low quality 3' ends, vector and linker sequences, polyA
tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial contamination sequences, and
sequences smaller than 50 base pairs. In particular, low-information sequences and repetitive elements
(e.g., dinucleotide repeats, Alu repeats, etc.) were replaced by "n's", or masked, to prevent spurious
matches.
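
The length filter and repeat masking described above can be sketched as follows. This is only an illustration: the text gives the 50 base pair cutoff and the replacement by "n", but the minimum repeat length (MIN_REPEAT_UNITS) and the function names are assumptions, and Alu, vector, and contamination screening are not shown.

```python
import re

MIN_LENGTH = 50        # from the text: sequences smaller than 50 bp are eliminated
MIN_REPEAT_UNITS = 6   # assumption: the text states no repeat-length threshold

# One dinucleotide followed by at least MIN_REPEAT_UNITS - 1 further copies of itself.
_DINUC_RUN = re.compile(r"([ACGT]{2})\1{%d,}" % (MIN_REPEAT_UNITS - 1), re.IGNORECASE)


def mask_dinucleotide_repeats(seq):
    """Replace each long dinucleotide repeat run with the same number of 'n' characters."""
    return _DINUC_RUN.sub(lambda m: "n" * len(m.group(0)), seq)


def preprocess(seqs):
    """Drop sequences shorter than MIN_LENGTH, then mask repeats in the remainder."""
    return [mask_dinucleotide_repeats(s) for s in seqs if len(s) >= MIN_LENGTH]
```

Masking keeps the sequence length unchanged, so positions reported against the template remain valid after masking.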

Processed sequences were then subject to assembly procedures in which the sequences were
assigned to gene bins (bins). Each sequence could only belong to one bin. Sequences in each gene bin
were assembled to produce consensus sequences (templates). Subsequent new sequences were added to
existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate pairs were identified as
all BLAST hits having a quality score greater than or equal to 150. Alignments of at least 82% local
identity were accepted into the bin. The component sequences from each bin were assembled using a
version of PHRAP. Bins with several overlapping component sequences were assembled using DEEP
PHRAP. The orientation (sense or antisense) of each assembled template was determined based on the
number and orientation of its component sequences. Template sequences as disclosed in the sequence
listing correspond to sense strand sequences (the "forward" reading frames), to the best determination.
The complementary (antisense) strands are inherently disclosed herein. The component sequences
which were used to assemble each template consensus sequence are listed in Table 4, along with their
positions along the template nucleotide sequences.

Bins were compared against each other and those having local similarity of at least 82% were
combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 95%
local identity) were re-split. Assembled templates were also subject to analysis by STITCHER/EXON
MAPPER algorithms which analyze the probabilities of the presence of splice variants, alternatively
spliced exons, splice junctions, differential expression of alternative spliced genes across tissue types or
disease states, etc. These resulting bins were subject to several rounds of the above assembly
procedures.

Once gene bins were generated based upon sequence alignments, bins were clone joined based
upon clone information. If the 5' sequence of one clone was present in one bin and the 3' sequence from
the same clone was present in a different bin, it was likely that the two bins actually belonged together
in a single bin. The resulting combined bins underwent assembly procedures to regenerate the
consensus sequences.
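
Clone joining as described above amounts to merging any two bins that hold the 5' and 3' reads of the same clone. A minimal sketch using a union-find structure follows; the input layout (a mapping from clone identifier to the pair of bins holding its 5' and 3' reads) and all names are hypothetical, and the subsequent reassembly of the merged bins is not shown.

```python
def join_bins_by_clone(clone_reads):
    """Merge bins linked by a shared clone.

    clone_reads maps a clone ID to (bin of its 5' read, bin of its 3' read).
    Returns the merged bins as a sorted list of sorted lists of bin names.
    """
    parent = {}

    def find(b):
        # Locate the representative of b's group, compressing the path on the way.
        parent.setdefault(b, b)
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for five_bin, three_bin in clone_reads.values():
        root_a, root_b = find(five_bin), find(three_bin)
        if root_a != root_b:
            parent[root_b] = root_a  # the two bins belong together

    groups = {}
    for b in list(parent):
        groups.setdefault(find(b), set()).add(b)
    return sorted(sorted(g) for g in groups.values())
```

Merging is transitive, so a chain of clones linking bin b1 to b2 and b2 to b3 places all three bins in one combined bin.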

The final assembled templates were subsequently annotated using the following procedure.
Template sequences were analyzed using BLASTn (v2.0, NCBI) versus gbpri (GenBank version 120).
"Hits" were defined as an exact match having from 95% local identity over 200 base pairs through
100% local identity over 100 base pairs, or a homolog match having an E-value, i.e. a probability
score, of ≤ 1 x 10^-8. The hits were subject to frameshift FASTx versus GENPEPT (GenBank version
120). (See Table 7). In this analysis, a homolog match was defined as having an E-value of ≤ 1 x 10^-8.
The assembly method used above was described in "System and Methods for Analyzing Biomolecular
Sequences," U.S.S.N. 09/276,534, filed March 25, 1999, and the LIFESEQ Gold user manual (Incyte),
both incorporated by reference herein.

Following assembly, template sequences were subjected to motif, BLAST, and functional 
analyses, and categorized in protein hierarchies using methods described in, e.g., "Database System 
Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 
08/812,290, filed March 6, 1997; "Relational Database for Storing Biomolecule Information," 
U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence 
Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for 
Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998, 
all of which are incorporated by reference herein. 

The template sequences were further analyzed by translating each template in all three forward
reading frames and searching each translation against the Pfam database of hidden Markov
model-based protein families and domains using the HMMER software package (available to the public from
Washington University School of Medicine, St. Louis MO). Regions of templates which, when
translated, contain similarity to Pfam consensus sequences are reported in Table 2, along with
descriptions of Pfam protein domains and families. Only those Pfam hits with an E-value of ≤ 1 x 10^-3
are reported. (See also World Wide Web site http://pfam.wustl.edu/ for detailed descriptions of Pfam
protein domains and families.)
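
Translation of a template in all three forward reading frames, as used above before the Pfam search, can be sketched as follows. The sketch assumes the standard genetic code and writes "X" for any codon containing a masked or ambiguous base; the function names are illustrative and the HMMER search itself is not shown.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def forward_frames(template):
    """Return the translations of forward reading frames 1, 2, and 3."""
    return [translate(template[offset:]) for offset in range(3)]
```

For example, forward_frames("ATGGCCTAA") gives "MA*" for frame 1, "WP" for frame 2, and "GL" for frame 3, where "*" marks a stop codon.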

Additionally, the template sequences were translated in all three forward reading frames, and 
each translation was searched against hidden Markov models for signal peptides using the HMMER 
software package. Construction of hidden Markov models and their usage in sequence analysis has
been described. (See, for example, Eddy, S.R. (1996) Curr. Opin. Str. Biol. 6:361-365.) Only those
signal peptide hits with a cutoff score of 11 bits or greater are reported. A cutoff score of 11 bits or
greater corresponds to at least about 91-94% true-positives in signal peptide prediction. Template
sequences were also translated in all three forward reading frames, and each translation was searched
against TMAP, a program that uses weight matrices to delineate transmembrane segments on protein
sequences and determine orientation, with respect to the cell cytosol (Persson, B. and P. Argos (1994) J.
Mol. Biol. 237:182-192; Persson, B. and P. Argos (1996) Protein Sci. 5:363-371.) Regions of
templates which, when translated, contain similarity to signal peptide or transmembrane consensus
sequences are reported in Table 3.

The results of HMMER analysis as reported in Tables 2 and 3 may support the results of
BLAST analysis as reported in Table 1 or may suggest alternative or additional properties of
template-encoded polypeptides not previously uncovered by BLAST or other analyses.

Template sequences are further analyzed using the bioinformatics tools listed in Table 7, or 
using sequence analysis software known in the art such as MACDNASIS PRO software (Hitachi 
Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Template
sequences may be further queried against public databases such as the GenBank rodent, mammalian,
vertebrate, prokaryote, and eukaryote databases. 

The template sequences were translated to derive the corresponding longest open reading frame 
as presented by the polypeptide sequences. Alternatively, a polypeptide of the invention may begin at 
any of the methionine residues within the full length translated polypeptide. Polypeptide sequences
were subsequently analyzed by querying against the GenBank protein database (GENPEPT, (GenBank
version 121)). Full length polynucleotide sequences are also analyzed using MACDNASIS PRO
software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software
(DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default
parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence
alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.

Table 6 shows sequences with homology to the polypeptides of the invention as identified by 
BLAST analysis against the GenBank protein (GENPEPT) database. Column 1 shows the polypeptide 
sequence identification number (SEQ ID NO:) for the polypeptide segments of the invention. Column 2 
shows the reading frame used in the translation of the polynucleotide sequences encoding the
polypeptide segments. Column 3 shows the length of the translated polypeptide segments. Columns 4
and 5 show the start and stop nucleotide positions of the polynucleotide sequences encoding the
polypeptide segments. Column 6 shows the GenBank identification number (GI Number) of the nearest
GenBank homolog. Column 7 shows the probability score for the match between each polypeptide and
its GenBank homolog. Column 8 shows the annotation of the GenBank homolog.

V. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene 
and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a 
particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel, 1995,
supra, ch. 4 and 16.)

Analogous computer techniques applying BLAST were used to search for identical or related
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer
search can be modified to determine whether any particular match is categorized as exact or similar.
The basis of the search is the product score, which is defined as:

          BLAST Score x Percent Identity
--------------------------------------------
5 x minimum {length(Seq. 1), length(Seq. 2)}

The product score takes into account both the degree of similarity between two sequences and the length
of the sequence match. The product score is a normalized value between 0 and 100, and is calculated
as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided
by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by
assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for
every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more
than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The
product score represents a balance between fractional overlap and quality in a BLAST alignment. For
example, a product score of 100 is produced only for 100% identity over the entire length of the shorter
of the two sequences being compared. A product score of 70 is produced either by 100% identity and
70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is
produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
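
The product score calculation described above can be written out as a short sketch. The function names are illustrative only, and the HSP is assumed to be ungapped so that its score depends solely on the counts of matching and mismatching bases.

```python
def hsp_blast_score(matches, mismatches):
    """BLAST score of one HSP: +5 for every matching base, -4 for every mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(blast_score, percent_identity, len_seq1, len_seq2):
    """(BLAST score x percent identity) / (5 x length of the shorter sequence)."""
    return blast_score * percent_identity / (5 * min(len_seq1, len_seq2))


# 100% identity over the whole of a 100-base sequence.
full = product_score(hsp_blast_score(100, 0), 100.0, 100, 250)

# 100% identity over 70% of the shorter sequence.
partial = product_score(hsp_blast_score(70, 0), 100.0, 100, 250)

# 88% identity over the whole of the shorter sequence.
divergent = product_score(hsp_blast_score(88, 12), 88.0, 100, 250)
```

Under these assumptions the three cases evaluate to 100, 70, and about 69, in line with the examples given in the text.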

VI. Tissue Distribution Profiling

A tissue distribution profile is determined for each template by compiling the cDNA library
tissue classifications of its component cDNA sequences. Each component sequence is derived from a
cDNA library constructed from a human tissue. Each human tissue is classified into one of the
following categories: cardiovascular system; connective tissue; digestive system; embryonic structures;
endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune
system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs;
skin; stomatognathic system; unclassified/mixed; or urinary tract. Template sequences, component
sequences, and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte
Genomics, Palo Alto CA). 

Table 5 shows the tissue distribution profile for the templates of the invention. For each 
template, the three most frequently observed tissue categories are shown in column 3, along with the 
percentage of component sequences belonging to each category. Only tissue categories with
percentage values of ≥ 10% are shown. A tissue distribution of "widely distributed" in column 3
indicates percentage values of <10% in all tissue categories. 
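
The profile shown in Table 5 can be derived from the tissue categories of a template's component sequences along the following lines. This is a sketch only: the input layout (one category label per component sequence), the alphabetical tie-breaking, and the function name are assumptions not stated in the text.

```python
from collections import Counter


def tissue_profile(component_categories, top_n=3, min_percent=10.0):
    """Return up to top_n (category, percent) pairs with percent >= min_percent.

    An empty list corresponds to the "widely distributed" label.
    """
    counts = Counter(component_categories)
    total = sum(counts.values())
    if total == 0:
        return []
    # Most frequent first; ties are broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    profile = []
    for category, n in ranked[:top_n]:
        percent = 100.0 * n / total
        if percent >= min_percent:
            profile.append((category, percent))
    return profile


example = tissue_profile(
    ["liver"] * 5 + ["nervous system"] * 3 + ["skin"] + ["pancreas"]
)
```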



VII. Transcript Image Analysis

Transcript images are generated as described in Seilhamer et al., "Comparative Gene 
Transcript Analysis," U.S. Patent Number 5,840,484, incorporated herein by reference. 



VIII. Extension of Polynucleotide Sequences and Isolation of a Full-length cDNA 

Oligonucleotide primers designed using an mddt of the Sequence Listing are used to extend the
nucleic acid sequence. One primer is synthesized to initiate 5' extension of the template, and the other
primer, to initiate 3' extension of the template. The initial primers may be designed using OLIGO 4.06
software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or another appropriate
program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to
anneal to the target sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides
which would result in hairpin structures and primer-primer dimerizations is avoided. Selected human
cDNA libraries are used to extend the sequence. If more than one extension is necessary or desired,
additional or nested sets of primers are designed.
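
Two of the primer criteria above, length and GC content, are simple enough to check directly. The sketch below is illustrative and is not the OLIGO 4.06 procedure; annealing temperature, hairpin, and primer-dimer screening are deliberately left out because the text does not specify how they are computed, and the function names are hypothetical.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer must not be empty")
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check the stated length range (22 to 30 nt) and GC content (50% or more)."""
    return min_len <= len(primer) <= max_len and gc_percent(primer) >= min_gc


candidate = "GCGCATAT" * 3  # hypothetical 24-mer with 50% GC
```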

High fidelity amplification is obtained by PCR using methods well known in the art. PCR is
performed in 96-well plates using the PTC-200 thermal cycler (MJ Research). The reaction mix
contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and
β-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life
Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair
PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 
to determine which reactions are successful in extending the sequence.

The extended nucleotides are desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For
shotgun sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose
gels, fragments are excised, and agar digested with AGAR ACE (Promega). Extended clones are
religated using T4 ligase (New England Biolabs, Inc., Beverly MA) into pUC 18 vector (Amersham
Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs,
and transfected into competent E. coli cells. Transformed cells are selected on antibiotic-containing
media, and individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x
carbenicillin liquid media.

The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham
Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1:
94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN
reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified
using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2,
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC
DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle
sequencing ready reaction kit (Applied Biosystems).

In like manner, the mddt is used to obtain regulatory sequences (promoters, introns, and 
enhancers) using the procedure above, oligonucleotides designed for such extension, and an appropriate 
genomic library. 

IX. Labeling of Probes and Southern Hybridization Analyses

Hybridization probes derived from the mddt of the Sequence Listing are employed for 
screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides between 100 and 
1000 nucleotides in length is specifically described, but essentially the same procedure may be used 
with larger cDNA fragments. Probe sequences are labeled at room temperature for 30 minutes using a
T4 polynucleotide kinase, γ32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech)
buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech). The
probe mixture is diluted to 10^7 dpm/µg/ml hybridization buffer and used in a typical membrane-based
hybridization analysis. 

The DNA is digested with a restriction endonuclease such as Eco RV and is electrophoresed 
through a 0.7% agarose gel. The DNA fragments are transferred from the agarose to nylon membrane
(NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the
manufacturer of the membrane. Prehybridization is carried out for three or more hours at 68°C, and
hybridization is carried out overnight at 68°C. To remove non-specific signals, blots are sequentially
washed at room temperature under increasingly stringent conditions, up to 0.1x saline sodium citrate
(SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER cassette
(Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of standard and 
experimental lanes are compared. Essentially the same procedure is employed when screening RNA. 

X. Chromosome Mapping of mddt 

The cDNA sequences which were used to assemble SEQ ID NO: 1-45 are compared with 
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other 
implementations of the Smith-Waterman algorithm. Sequences from these databases that match SEQ 
ID NO: 1-45 are assembled into clusters of contiguous and overlapping sequences using assembly 
algorithms such as PHRAP (Table 7). Radiation hybrid and genetic mapping data available from 
public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome 
Research (WIGR), and Généthon are used to determine if any of the clustered sequences have been
previously mapped. Inclusion of a mapped sequence in a cluster will result in the assignment of all
sequences of that cluster, including its particular SEQ ID NO:, to that map location. The genetic map
locations of SEQ ID NO: 1-45 are described as ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances
are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid
markers whose sequences were included in each of the clusters.

XI. Microarray Analysis 
Probe Preparation from Tissue or Cell Samples

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
polyA+ RNA is purified using the oligo (dT) cellulose method. Each polyA+ RNA sample is reverse
transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-dT primer (21mer), 1X first strand
buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM dCTP,
40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse transcription
reaction is performed in a 25 ml volume containing 200 ng polyA+ RNA with GEMBRIGHT kits
(Incyte). Specific control polyA+ RNAs are synthesized by in vitro transcription from non-coding yeast
genomic DNA (W. Lei, unpublished). As quantitative controls, the control mRNAs at 0.002 ng, 0.02 
ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at ratios of 1:100,000, 1:10,000, 
1:1000, 1:100 (w/w) to sample mRNA respectively. The control mRNAs are diluted into reverse 
transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w) to sample mRNA differential 
expression patterns. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and another
with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20 minutes at
85°C to stop the reaction and degrade the RNA. Probes are purified using two successive
CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo
Alto CA) and after combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1
mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The probe is then dried to completion
using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 µl 5X SSC/0.2%
SDS. 

Microarray Preparation

Sequences of the present invention are used to generate array elements. Each array element is 
amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses 
primers complementary to the vector sequences flanking the cDNA insert. Array elements are 
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 µg.
Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech). 

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR Scientific
Products Corporation (VWR), West Chester, PA), washed extensively in distilled water, and coated
with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C oven.

Array elements are applied to the coated glass substrate using a procedure described in US 
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). 
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. 
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate 
buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in 0.2%
SDS and distilled water as before.

Hybridization

Hybridization reactions contain 9 µl of probe mixture consisting of 0.2 µg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The probe mixture
is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8
cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger
than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 µl of
5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5
hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1% SDS),
three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.

Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. 
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, 
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate 
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source,
although the apparatus is capable of recording the spectra from both fluorophores simultaneously. 

The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the probe mix at a known concentration. A specific location on the
array contains a complementary DNA sequence, allowing the intensity of the signal at that location to
be correlated with a weight ratio of hybridizing species of 1:100,000. When two probes from different
sources (e.g., representing test and control cells), each labeled with a different fluorophore,
are hybridized to a single array for the purpose of identifying genes that are differentially expressed, the
calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding
identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood, MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission
spectra) between the fluorophores using each fluorophore's emission spectrum.
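
As a rough illustration of the two operations just described, the sketch below maps a 12-bit reading onto 20 linear pseudocolor levels and corrects a pair of simultaneous readings for crosstalk. The crosstalk model is an assumption: each detector is taken to see its own fluorophore plus a fixed fraction of the other, and the fractions used are hypothetical placeholders for values that would be derived from the emission spectra.

```python
def pseudocolor_index(adc_value, levels=20, adc_bits=12):
    """Map an A/D reading linearly onto color levels 0 (blue) .. levels-1 (red)."""
    full_scale = 1 << adc_bits          # 4096 counts for a 12-bit converter
    clamped = min(max(int(adc_value), 0), full_scale - 1)
    return clamped * levels // full_scale


def unmix_two_channels(measured_cy3, measured_cy5, cy5_into_cy3, cy3_into_cy5):
    """Correct simultaneous readings for linear two-channel crosstalk.

    Assumed model (not from the specification):
        measured_cy3 = true_cy3 + cy5_into_cy3 * true_cy5
        measured_cy5 = cy3_into_cy5 * true_cy3 + true_cy5
    """
    det = 1.0 - cy5_into_cy3 * cy3_into_cy5
    if det <= 0.0:
        raise ValueError("crosstalk fractions are too large to invert")
    true_cy3 = (measured_cy3 - cy5_into_cy3 * measured_cy5) / det
    true_cy5 = (measured_cy5 - cy3_into_cy5 * measured_cy3) / det
    return true_cy3, true_cy5


# Hypothetical readings: 2% of Cy5 leaks into the Cy3 detector,
# 10% of Cy3 leaks into the Cy5 detector.
corrected_cy3, corrected_cy5 = unmix_two_channels(1010.0, 600.0, 0.02, 0.10)
```
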

A grid is superimposed over the fluorescence signal image such that the signal from each spot
is centered in each element of the grid. The fluorescence signal within each element is then integrated to
obtain a numerical value corresponding to the average intensity of the signal. The software used for
signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
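
The grid integration step can be sketched as follows. This is not the GEMTOOLS implementation, which is not described here; it is a minimal illustration that assumes a regular grid of square elements and an image stored as a list of pixel rows, with a hypothetical function name and hypothetical pixel values.

```python
def integrate_grid(image, cell):
    """Return the average pixel intensity of each cell-by-cell grid element.

    image is a list of equal-length rows of pixel intensities; pixels that do
    not fill a complete grid element at the right or bottom edge are ignored.
    """
    n_rows = len(image) // cell
    n_cols = len(image[0]) // cell
    averages = []
    for r in range(n_rows):
        row_values = []
        for c in range(n_cols):
            pixels = [
                image[y][x]
                for y in range(r * cell, (r + 1) * cell)
                for x in range(c * cell, (c + 1) * cell)
            ]
            row_values.append(sum(pixels) / len(pixels))
        averages.append(row_values)
    return averages


# Hypothetical 4 x 4 pixel image covering a 2 x 2 grid of spots.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 4, 8],
    [0, 0, 8, 4],
]
spot_averages = integrate_grid(image, cell=2)
```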

XII. Complementary Nucleic Acids

Sequences complementary to the mddt are used to detect, decrease, or inhibit expression of the
naturally occurring nucleotide. The use of oligonucleotides comprising from about 15 to 30 base pairs
is typical in the art. However, smaller or larger sequence fragments can also be used. Appropriate
oligonucleotides are designed from the mddt using OLIGO 4.06 software (National Biosciences) or
other appropriate programs and are synthesized using methods standard in the art or ordered from a
commercial supplier. To inhibit transcription, a complementary oligonucleotide is designed from the
most unique 5' sequence and used to prevent transcription factor binding to the promoter sequence. To
inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding and
processing of the transcript.
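
The core operation behind designing a complementary oligonucleotide is taking the reverse complement of a chosen target window. The sketch below shows only that step; it is not the OLIGO 4.06 algorithm and performs no melting-temperature or uniqueness checks. The target sequence and function names are hypothetical, and the 15 to 30 base length limit follows the typical range stated above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def complementary_oligo(target, start, length=20):
    """Return an oligonucleotide complementary to target[start:start+length].

    The length is restricted to the 15-30 base range described as typical.
    """
    if not 15 <= length <= 30:
        raise ValueError("length outside the typical 15-30 base range")
    window = target[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the target sequence")
    return reverse_complement(window)


# Hypothetical 5' region of a target sequence.
target = "ATGGCCAAGTTGACCAGTGCCGTT"
oligo = complementary_oligo(target, start=0, length=15)
```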

XIII. Expression of MDDT 

Expression and purification of MDDT are accomplished using bacterial or virus-based
expression systems. For expression of MDDT in bacteria, cDNA is subcloned into an appropriate
vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of
cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac)
hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator
regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g.,
BL21(DE3). Antibiotic resistant bacteria express MDDT upon induction with isopropyl
beta-D-thiogalactopyranoside (IPTG). Expression of MDDT in eukaryotic cells is achieved by infecting insect
or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is
replaced with cDNA encoding MDDT by either homologous recombination or bacterial-mediated
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases.
Infection of the latter requires additional genetic modifications to baculovirus. (See, e.g., Engelhard,
supra; and Sandig, supra.)

In most expression systems, MDDT is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized
glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia
Biotech). Following purification, the GST moiety can be proteolytically cleaved from MDDT at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak
Company, Rochester NY). 6-His, a stretch of six consecutive histidine residues, enables purification on
metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in
Ausubel (1995, supra, Chapters 10 and 16). Purified MDDT obtained by these methods can be used
directly in the following activity assay.

XIV. Demonstration of MDDT Activity 

MDDT, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent.
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules
previously arrayed in the wells of a multi-well plate are incubated with the labeled MDDT, washed, and
any wells with labeled MDDT complex are assayed. Data obtained using different concentrations of
MDDT are used to calculate values for the number, affinity, and association of MDDT with the
candidate molecules.
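
The text does not name the calculation used to derive number and affinity from the binding data. One conventional choice, assumed here purely for illustration, is Scatchard analysis of a single class of binding sites, in which bound/free is plotted against bound, the slope equals -1/Kd, and the x-intercept equals the number of sites (Bmax). The concentrations below are synthetic values generated from a one-site model, not experimental data.

```python
def scatchard_fit(free, bound):
    """Estimate (Kd, Bmax) by least-squares fit of bound/free against bound.

    Assumes a single class of independent binding sites, for which
    bound/free = (Bmax - bound) / Kd.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data are not consistent with one-site binding")
    intercept = mean_y - slope * mean_x   # equals Bmax / Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic one-site data generated with Kd = 2.0 and Bmax = 100.0.
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [100.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(free, bound)
```

With real data, curvature in the Scatchard plot would suggest that the one-site assumption does not hold, and nonlinear fitting of the untransformed data is generally preferred.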

Alternatively, molecules interacting with MDDT are analyzed using the yeast two-hybrid 
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially
available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH). 

MDDT may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Patent
No. 6,057,101).

XV. Functional Assays 

MDDT function is assessed by expressing mddt at physiologically elevated levels in
mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing a
strong promoter that drives high levels of cDNA expression. Vectors of choice include pCMV SPORT
(Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA), both of which contain the
cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into a human cell
line, preferably of endothelial or hematopoietic origin, using either liposome formulations or
electroporation. 1-2 μg of an additional plasmid containing sequences encoding a marker protein are
co-transfected.

Expression of a marker protein provides a means to distinguish transfected cells from
nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a
CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is used
to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the
cells and other cellular properties.

FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding
or coincident with cell death. These events include changes in nuclear DNA content as measured by
staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward
light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by
decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular
proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane
composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell
surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry, Oxford,
New York NY.

The influence of MDDT on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding MDDT and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding MDDT and other genes of interest can be analyzed by northern analysis
or microarray techniques.

XVI. Production of Antibodies 

MDDT substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize rabbits and to produce antibodies using standard protocols.

Alternatively, the MDDT amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is synthesized
and used to raise antibodies by means known to those of skill in the art. Methods for selection of
appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in
the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)
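
As an illustration of how a hydrophilic candidate region might be located, the sketch below slides a window along a protein sequence and reports the window with the highest average Hopp-Woods hydrophilicity. This is a substitute technique chosen for illustration; it is not the LASERGENE algorithm, and hydrophilicity alone is only a rough proxy for immunogenicity. The example sequence and function name are hypothetical.

```python
# Hopp-Woods hydrophilicity values per amino acid (one-letter codes).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=15):
    """Return (start, peptide, mean score) of the most hydrophilic window.

    The default window of 15 matches the peptide length mentioned in the
    text; ties are resolved in favor of the earliest window.
    """
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best = None
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        score = sum(HOPP_WOODS[aa] for aa in peptide) / window
        if best is None or score > best[2]:
            best = (start, peptide, score)
    return best


# Hypothetical sequence with a charged stretch between hydrophobic flanks.
sequence = "LLLLLL" + "KDEKRD" + "FFFFFF"
start, peptide, score = most_hydrophilic_window(sequence, window=6)
```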

Typically, peptides 15 residues in length are synthesized using an ABI 431A peptide
synthesizer (Applied Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by reaction with
N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See, e.g.,
Ausubel, supra.) Rabbits are immunized with the peptide-KLH complex in complete Freund's
adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to
plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with
radio-iodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-MDDT activity
using protocols well known in the art, including ELISA, RIA, and immunoblotting.

XVII. Purification of Naturally Occurring MDDT Using Specific Antibodies 

Naturally occurring or recombinant MDDT is substantially purified by immunoaffinity
chromatography using antibodies specific for MDDT. An immunoaffinity column is constructed by
covalently coupling anti-MDDT antibody to an activated chromatographic resin, such as
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is
blocked and washed according to the manufacturer's instructions.

Media containing MDDT are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential adsorption of MDDT (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/MDDT binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such
as urea or thiocyanate ion), and MDDT is collected.

All publications and patents mentioned in the above specification are herein incorporated by
reference. Various modifications and variations of the described method and system of the invention 
will be apparent to those skilled in the art without departing from the scope and spirit of the invention. 
Although the invention has been described in connection with specific preferred embodiments, it should 
be understood that the invention as claimed should not be unduly limited to such specific embodiments. 
Indeed, various modifications of the above-described modes for carrying out the invention which are 
obvious to those skilled in the field of molecular biology or related fields are intended to be within the 
scope of the following claims.

TABLE 3

SEQ ID NO:  Template ID             Start  Stop   Frame      Domain Type  Topology
1           LG:977683.1:2000FEB18   373    459    forward 1  TM           N in
1           LG:977683.1:2000FEB18   657    731    forward 3  TM           N out
2           LG:893050.1:2000FEB18   15     101    forward 3  TM           N out
3           LG:980153.1:2000FEB18   313    375    forward 1  TM           N out
3           LG:980153.1:2000FEB18   391    453    forward 1  TM           N out
3           LG:980153.1:2000FEB18   278    364    forward 2  TM           N out
3           LG:980153.1:2000FEB18   416    493    forward 2  TM           N out
3           LG:980153.1:2000FEB18   809    871    forward 2  TM           N out
3           LG:980153.1:2000FEB18   902    964    forward 2  TM           N out
3           LG:980153.1:2000FEB18   1181   1264   forward 2  TM           N out
3           LG:980153.1:2000FEB18   1427   1510   forward 2  TM           N out
3           LG:980153.1:2000FEB18   1733   1798   forward 2  TM           N out
3           LG:980153.1:2000FEB18   1868   1954   forward 2  TM           N out
3           LG:980153.1:2000FEB18   2141   2227   forward 2  TM           N out
3           LG:980153.1:2000FEB18   2261   2308   forward 2  TM           N out
3           LG:980153.1:2000FEB18   60     125    forward 3  TM           N in
3           LG:980153.1:2000FEB18   402    476    forward 3  TM           N in
3           LG:980153.1:2000FEB18   2031   2081   forward 3  TM           N in
3           LG:980153.1:2000FEB18   2142   2213   forward 3  TM           N in
5           LG:475551.1:2000FEB18   2134   2208   forward 1  TM           N in
5           LG:475551.1:2000FEB18   2039   2125   forward 2  TM           N out
5           LG:475551.1:2000FEB18   1167   1217   forward 3  TM           N in
6           LG:481407.2:2000FEB18   874    927    forward 1  TM
6           LG:481407.2:2000FEB18   949    1035   forward 1  TM
6           LG:481407.2:2000FEB18   1081   1161   forward 1  TM
6           LG:481407.2:2000FEB18   1510   1584   forward 1  TM
6           LG:481407.2:2000FEB18   1355   1435   forward 2  TM           N out
6           LG:481407.2:2000FEB18   1439   1525   forward 2  TM           N out
6           LG:481407.2:2000FEB18   1326   1409   forward 3  TM           N in
6           LG:481407.2:2000FEB18   1446   1526   forward 3  TM           N in
6           LG:481407.2:2000FEB18   1545   1616   forward 3  TM           N in
7           LI:443580.1:2000FEB01   488    574    forward 2  TM           N out
10          LG:171377.1:2000MAY19   318    386    forward 3  TM           N in
10          LG:171377.1:2000MAY19   549    635    forward 3  TM           N in
10          LG:171377.1:2000MAY19   669    740    forward 3  TM           N in
12          LG:247384.1:2000MAY19   1381   1461   forward 1  TM           N in
12          LG:247384.1:2000MAY19   1624   1710   forward 1  TM           N in
12          LG:247384.1:2000MAY19   1409   1495   forward 2  TM           N in
12          LG:247384.1:2000MAY19   1395   1481   forward 3  TM           N in
12          LG:247384.1:2000MAY19   1617   1679   forward 3  TM           N in
13          LG:403872.1:2000MAY19   535    621    forward 1  TM           N in
13          LG:403872.1:2000MAY19   1360   1446   forward 1  TM           N in
13          LG:403872.1:2000MAY19   1522   1581   forward 1  TM           N in
13          LG:403872.1:2000MAY19   1828   1902   forward 1  TM           N in
13          LG:403872.1:2000MAY19   1957   2022   forward 1  TM           N in
13          LG:403872.1:2000MAY19   299    349    forward 2  TM           N in
13          LG:403872.1:2000MAY19   1361   1423   forward 2  TM           N in
13          LG:403872.1:2000MAY19   1439   1501   forward 2  TM           N in
13          LG:403872.1:2000MAY19   1553   1627   forward 2  TM           N in
13          LG:403872.1:2000MAY19   1859   1918   forward 2  TM           N in
13          LG:403872.1:2000MAY19   2027   2110   forward 2  TM           N in
13          LG:403872.1:2000MAY19   2117   2203   forward 2  TM           N in
13          LG:403872.1:2000MAY19   369    452    forward 3  TM           N in
13          LG:403872.1:2000MAY19   549    635    forward 3  TM           N in
13          LG:403872.1:2000MAY19   708    785    forward 3  TM           N in
13          LG:403872.1:2000MAY19   1101   1187   forward 3  TM           N in
13          LG:403872.1:2000MAY19   1419   1505   forward 3  TM           N in
13          LG:403872.1:2000MAY19   1575   1661   forward 3  TM           N in
13          LG:403872.1:2000MAY19   2115   2192   forward 3  TM           N in
13          LG:403872.1:2000MAY19   2226   2273   forward 3  TM           N in
14          LG:1135213.1:2000MAY19  41     127    forward 2  TM           N out
14          LG:1135213.1:2000MAY19  215    274    forward 2  TM           N out
14          LG:1135213.1:2000MAY19  293    379    forward 2  TM           N out
14          LG:1135213.1:2000MAY19  389    475    forward 2  TM           N out
16          LG:342147.1:2000MAY19   142    204    forward 1  TM           N out
16          LG:342147.1:2000MAY19   171    251    forward 3  TM           N out
17          LG:1097300.1:2000MAY19  487    564    forward 1  TM
17          LG:1097300.1:2000MAY19  805    891    forward 1  TM
17          LG:1097300.1:2000MAY19  1372   1458   forward 1  TM
17          LG:1097300.1:2000MAY19  668    754    forward 2  TM           N out
17          LG:1097300.1:2000MAY19  803    874    forward 2  TM           N out
17          LG:1097300.1:2000MAY19  1358   1441   forward 2  TM           N out
17          LG:1097300.1:2000MAY19  522    578    forward 3  TM           N in
17          LG:1097300.1:2000MAY19  750    836    forward 3  TM           N in
17          LG:1097300.1:2000MAY19  894    956    forward 3  TM           N in
17          LG:1097300.1:2000MAY19  1068   1145   forward 3  TM           N in
18          LG:444850.9:2000MAY19   253    315    forward 1  TM           N in
19          LG:402231.6:2000MAY19   407    484    forward 2  TM           N in
23          LG:350793.2:2000MAY19   148    222    forward 1  TM           N in
23          LG:350793.2:2000MAY19   316    384    forward 1  TM           N in
23          LG:350793.2:2000MAY19   1144   1215   forward 1  TM           N in
23          LG:350793.2:2000MAY19   1231   1293   forward 1  TM           N in
23          LG:350793.2:2000MAY19   1339   1425   forward 1  TM           N in
23          LG:350793.2:2000MAY19   1459   1521   forward 1  TM           N in
23          LG:350793.2:2000MAY19   1582   1662   forward 1  TM           N in
23          LG:350793.2:2000MAY19   1882   1953   forward 1  TM           N in
23          LG:350793.2:2000MAY19   1514   1600   forward 2  TM
23          LG:350793.2:2000MAY19   2135   2221   forward 2  TM
23          LG:350793.2:2000MAY19   1422   1493   forward 3  TM
23          LG:350793.2:2000MAY19   2268   2354   forward 3  TM
24          LG:408751.3:2000MAY19   1202   1264   forward 2  TM           N out
24          LG:408751.3:2000MAY19   1137   1223   forward 3  TM           N in
25          LI:336120.1:2000MAY01   241    297    forward 1  TM           N in
25          LI:336120.1:2000MAY01   616    702    forward 1  TM           N in
25          LI:336120.1:2000MAY01   1141   1200   forward 1  TM           N in
25          LI:336120.1:2000MAY01   2524   2598   forward 1  TM           N in
25          LI:336120.1:2000MAY01   1163   1213   forward 2  TM           N in
25          LI:336120.1:2000MAY01   1922   1972   forward 2  TM           N in
25          LI:336120.1:2000MAY01   2060   2119   forward 2  TM           N in
25          LI:336120.1:2000MAY01   2510   2596   forward 2  TM           N in
25          LI:336120.1:2000MAY01   663    749    forward 3  TM           N in
25          LI:336120.1:2000MAY01   1380   1445   forward 3  TM           N in
25          LI:336120.1:2000MAY01   1839   1925   forward 3  TM           N in
25          LI:336120.1:2000MAY01   2148   2234   forward 3  TM           N in
25          LI:336120.1:2000MAY01   2418   2471   forward 3  TM           N in
25          LI:336120.1:2000MAY01   2499   2585   forward 3  TM           N in
26          LI:234104.2:2000MAY01   1873   1947   forward 1  TM           N out
26          LI:234104.2:2000MAY01   2155   2241   forward 1  TM           N out
26          LI:234104.2:2000MAY01   3616   3690   forward 1  TM           N out
26          LI:234104.2:2000MAY01   1112   1168   forward 2  TM           N in
26          LI:234104.2:2000MAY01   2216   2302   forward 2  TM           N in
26          LI:234104.2:2000MAY01   3632   3718   forward 2  TM           N in
26          LI:234104.2:2000MAY01   3998   4045   forward 2  TM           N in
26          LI:234104.2:2000MAY01   1314   1400   forward 3  TM           N in
26          LI:234104.2:2000MAY01   2172   2258   forward 3  TM           N in
26          LI:234104.2:2000MAY01   2607   2684   forward 3  TM           N in
26          LI:234104.2:2000MAY01   2739   2798   forward 3  TM           N in
26          LI:234104.2:2000MAY01   2841   2891   forward 3  TM           N in
26          LI:234104.2:2000MAY01   3621   3707   forward 3  TM           N in
26          LI:234104.2:2000MAY01   4080   4145   forward 3  TM           N in
28          LI:119992.3:2000MAY01   22     102    forward 1  TM           N out
28          LI:119992.3:2000MAY01   151    237    forward 1  TM           N out
28          LI:119992.3:2000MAY01   1444   1530   forward 1  TM           N out
28          LI:119992.3:2000MAY01   1603   1683   forward 1  TM           N out
28          LI:119992.3:2000MAY01   1729   1809   forward 1  TM           N out
28          LI:119992.3:2000MAY01   2197   2253   forward 1  TM           N out
28          LI:119992.3:2000MAY01   2269   2355   forward 1  TM           N out
28          LI:119992.3:2000MAY01   2989   3075   forward 1  TM           N out
28          LI:119992.3:2000MAY01   3163   3249   forward 1  TM           N out
28          LI:119992.3:2000MAY01   1247   1333   forward 2  TM           N in
28          LI:119992.3:2000MAY01   1538   1606   forward 2  TM           N in
28          LI:119992.3:2000MAY01   2207   2293   forward 2  TM           N in
28          LI:119992.3:2000MAY01   2756   2812   forward 2  TM           N in
28          LI:119992.3:2000MAY01   3098   3169   forward 2  TM           N in
28          LI:119992.3:2000MAY01   3281   3343   forward 2  TM           N in
28          LI:119992.3:2000MAY01   3356   3418   forward 2  TM           N in
28          LI:119992.3:2000MAY01   120    188    forward 3  TM           N in
28          LI:119992.3:2000MAY01   627    689    forward 3  TM           N in
28          LI:119992.3:2000MAY01   708    770    forward 3  TM           N in
28          LI:119992.3:2000MAY01   1425   1511   forward 3  TM           N in
28          LI:119992.3:2000MAY01   1782   1868   forward 3  TM           N in
28          LI:119992.3:2000MAY01   2223   2306   forward 3  TM           N in
28          LI:119992.3:2000MAY01   2757   2843   forward 3  TM           N in
28          LI:119992.3:2000MAY01   3027   3113   forward 3  TM           N in
28          LI:119992.3:2000MAY01   3213   3275   forward 3  TM           N in
28          LI:119992.3:2000MAY01   3312   3374   forward 3  TM           N in
29          LI:197241.2:2000MAY01   289    369    forward 1  TM           N out
29          LI:197241.2:2000MAY01   430    507    forward 1  TM           N out
29          LI:197241.2:2000MAY01   799    861    forward 1  TM           N out
29          LI:197241.2:2000MAY01   889    951    forward 1  TM           N out
29          LI:197241.2:2000MAY01   1798   1863   forward 1  TM           N out
29          LI:197241.2:2000MAY01   1930   2016   forward 1  TM           N out
29          LI:197241.2:2000MAY01   2101   2148   forward 1  TM           N out
29          LI:197241.2:2000MAY01   2206   2262   forward 1  TM           N out
29          LI:197241.2:2000MAY01   416    499    forward 2  TM           N out
29          LI:197241.2:2000MAY01   812    862    forward 2  TM           N out
29          LI:197241.2:2000MAY01   1226   1309   forward 2  TM           N out
29          LI:197241.2:2000MAY01   1475   1558   forward 2  TM           N out
29          LI:197241.2:2000MAY01   2210   2296   forward 2  TM           N out
29          LI:197241.2:2000MAY01   60     125    forward 3  TM           N in
29          LI:197241.2:2000MAY01   333    395    forward 3  TM           N in
29          LI:197241.2:2000MAY01   441    503    forward 3  TM           N in
29          LI:197241.2:2000MAY01   2223   2300   forward 3  TM           N in
31          LI:142384.1:2000MAY01   367    432    forward 1  TM           N out
31          LI:142384.1:2000MAY01   93     155    forward 3  TM           N out
32          LI:895427.1:2000MAY01   1796   1879   forward 2  TM           N in
32          LI:895427.1:2000MAY01   1656   1724   forward 3  TM           N in
33          LI:757439.1:2000MAY01   253    312    forward 1  TM           N in
33          LI:757439.1:2000MAY01   817    900    forward 1  TM           N in
33          LI:757439.1:2000MAY01   1507   1572   forward 1  TM           N in
33          LI:757439.1:2000MAY01   1615   1677   forward 1  TM           N in
33          LI:757439.1:2000MAY01   1696   1758   forward 1  TM           N in
33          LI:757439.1:2000MAY01   1834   1899   forward 1  TM           N in
33          LI:757439.1:2000MAY01   1969   2043   forward 1  TM           N in
33          LI:757439.1:2000MAY01   2107   2193   forward 1  TM           N in
33          LI:757439.1:2000MAY01   2506   2586   forward 1  TM           N in
33          LI:757439.1:2000MAY01   815    901    forward 2  TM           N out
33          LI:757439.1:2000MAY01   1634   1720   forward 2  TM           N out
33          LI:757439.1:2000MAY01   1796   1882   forward 2  TM           N out
33          LI:757439.1:2000MAY01   1952   2026   forward 2  TM           N out
33          LI:757439.1:2000MAY01   2486   2563   forward 2  TM           N out
33          LI:757439.1:2000MAY01   783    869    forward 3  TM           N in
33          LI:757439.1:2000MAY01   996    1049   forward 3  TM           N in
33          LI:757439.1:2000MAY01   1545   1631   forward 3  TM           N in
33          LI:757439.1:2000MAY01   2115   2174   forward 3  TM           N in
35          LI:243660.4:2000MAY01   1247   1333   forward 2  TM           N in
36          LI:334386.1:2000MAY01   538    621    forward 1  TM
36          LI:334386.1:2000MAY01   922    1008   forward 1  TM
36          LI:334386.1:2000MAY01   1087   1173   forward 1  TM
36          LI:334386.1:2000MAY01   1468   1530   forward 1  TM
36          LI:334386.1:2000MAY01   1570   1632   forward 1  TM
36          LI:334386.1:2000MAY01   2731   2802   forward 1  TM
36          LI:334386.1:2000MAY01   2992   3054   forward 1  TM
36          LI:334386.1:2000MAY01   3325   3387   forward 1  TM
36          LI:334386.1:2000MAY01   3406   3468   forward 1  TM
36          LI:334386.1:2000MAY01   3487   3570   forward 1  TM
36          LI:334386.1:2000MAY01   3766   3852   forward 1  TM
36          LI:334386.1:2000MAY01   4006   4077   forward 1  TM
36          LI:334386.1:2000MAY01   4342   4416   forward 1  TM
36          LI:334386.1:2000MAY01   4615   4686   forward 1  TM
36          LI:334386.1:2000MAY01   4747   4833   forward 1  TM
36          LI:334386.1:2000MAY01   5062   5124   forward 1  TM
36          LI:334386.1:2000MAY01   5140   5202   forward 1  TM
36          LI:334386.1:2000MAY01   5227   5289   forward 1  TM
36          LI:334386.1:2000MAY01   5563   5649   forward 1  TM
36          LI:334386.1:2000MAY01   1235   1321   forward 2  TM           N in
36          LI:334386.1:2000MAY01   2423   2476   forward 2  TM           N in
36          LI:334386.1:2000MAY01   2702   2764   forward 2  TM           N in
36          LI:334386.1:2000MAY01   2792   2854   forward 2  TM           N in
36          LI:334386.1:2000MAY01   3086   3172   forward 2  TM           N in
36          LI:334386.1:2000MAY01   3302   3355   forward 2  TM           N in
36          LI:334386.1:2000MAY01   3452   3517   forward 2  TM           N in
36          LI:334386.1:2000MAY01   3920   4006   forward 2  TM           N in
36          LI:334386.1:2000MAY01   4064   4144   forward 2  TM           N in
36          LI:334386.1:2000MAY01   4250   4318   forward 2  TM           N in
36          LI:334386.1:2000MAY01   4331   4402   forward 2  TM           N in
36          LI:334386.1:2000MAY01   4523   4576   forward 2  TM           N in
36          LI:334386.1:2000MAY01   4586   4669   forward 2  TM           N in
36          LI:334386.1:2000MAY01   4772   4855   forward 2  TM           N in
36          LI:334386.1:2000MAY01   5039   5125   forward 2  TM           N in
36 


LI:334386.1:2000MAY01 


5498 


5584 


forward 2 


TM 


Nin 
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36 


Ll:334386.1 :2000MAY01 


30 


116 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


324 


380 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


387 


470 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


531 


608 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


1362 


1448 


forward 3 


TM 


N in 


36 


1.1:334386.1 :2000MAY01 


1539 


1625 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


2232 


2279 


forward 3 


TM 


Nin 


36 


Ll:334386.1 :2000MAY01 


2580 


2651 


forward 3 


TM 


N in 


36 


Ll:334386.1 .2000MAY01 


2757 


2822 


forward 3 


TM 


Nin 


36 


Ll:334386.1 :2000MAY01 


2820 


2870 


forward 3 


TM 


N in 


36 


LI:334386.1:2000MAY01 


3282 


3368 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


3510 


3596 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


3981 


4064 


forward 3 


TM 


N in 


36 


LI:334386.1:2000MAY01 


4356 


4427 


forward 3 


TM 


N in 


36 


LI.-334386.1 :2000MAYQ1 


4464 


4544 


forward 3 


TM 


N in 


36 


LI:334386.1:2000MAY01 


4959 


5024 


forward 3 


TM 


N in 


36 


Ll:334386.1 :2000MAY01 


5601 


5687 


forward 3 


TM 


N in 


37  LI:347572.1:2000MAY01   790   876  forward 1  TM  N in
37  LI:347572.1:2000MAY01  1354  1434  forward 1  TM  N in
37  LI:347572.1:2000MAY01  2425  2511  forward 1  TM  N in
37  LI:347572.1:2000MAY01  2599  2685  forward 1  TM  N in
37  LI:347572.1:2000MAY01  2686  2757  forward 1  TM  N in
37  LI:347572.1:2000MAY01  3133  3207  forward 1  TM  N in
37  LI:347572.1:2000MAY01  1184  1255  forward 2  TM
37  LI:347572.1:2000MAY01  2264  2350  forward 2  TM
37  LI:347572.1:2000MAY01  2597  2665  forward 2  TM
37  LI:347572.1:2000MAY01  2942  3028  forward 2  TM
37  LI:347572.1:2000MAY01  3137  3199  forward 2  TM
37  LI:347572.1:2000MAY01  3227  3289  forward 2  TM
37  LI:347572.1:2000MAY01   129   215  forward 3  TM  N in
37  LI:347572.1:2000MAY01   969  1046  forward 3  TM  N in
37  LI:347572.1:2000MAY01  1947  2033  forward 3  TM  N in
37  LI:347572.1:2000MAY01  2208  2288  forward 3  TM  N in
37  LI:347572.1:2000MAY01  2412  2477  forward 3  TM  N in
37  LI:347572.1:2000MAY01  2604  2684  forward 3  TM  N in
37  LI:347572.1:2000MAY01  2739  2795  forward 3  TM  N in


38  LI:817314.1:2000MAY01   460   546  forward 1  TM
38  LI:817314.1:2000MAY01  1192  1278  forward 1  TM
38  LI:817314.1:2000MAY01  1318  1386  forward 1  TM
38  LI:817314.1:2000MAY01  1423  1485  forward 1  TM
38  LI:817314.1:2000MAY01  1537  1599  forward 1  TM
38  LI:817314.1:2000MAY01  1630  1692  forward 1  TM
38  LI:817314.1:2000MAY01  1756  1842  forward 1  TM
38  LI:817314.1:2000MAY01  1930  1992  forward 1  TM
38  LI:817314.1:2000MAY01  2032  2094  forward 1  TM
38  LI:817314.1:2000MAY01  2860  2946  forward 1  TM
38  LI:817314.1:2000MAY01  3127  3213  forward 1  TM
38  LI:817314.1:2000MAY01   362   448  forward 2  TM  N in
38  LI:817314.1:2000MAY01  3158  3244  forward 2  TM  N in
38  LI:817314.1:2000MAY01    30    95  forward 3  TM  N out
38  LI:817314.1:2000MAY01  1239  1301  forward 3  TM  N out
38  LI:817314.1:2000MAY01  1785  1865  forward 3  TM  N out
38  LI:817314.1:2000MAY01  1920  2000  forward 3  TM  N out
38  LI:817314.1:2000MAY01  3189  3269  forward 3  TM  N out


39  LI:000290.1:2000MAY01  1003  1065  forward 1  TM  N in
39  LI:000290.1:2000MAY01  1075  1137  forward 1  TM  N in
39  LI:000290.1:2000MAY01  1195  1248  forward 1  TM  N in
39  LI:000290.1:2000MAY01   767   844  forward 2  TM
39  LI:000290.1:2000MAY01   882   932  forward 3  TM  N in
40  LI:023518.3:2000MAY01    28   108  forward 1  TM  N out
40  LI:023518.3:2000MAY01    20   106  forward 2  TM  N in


41  LI:1084246.1:2000MAY01   178   264  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  2686  2760  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  2932  3003  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  3097  3159  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  3184  3246  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  3352  3405  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  3409  3480  forward 1  TM  N out
41  LI:1084246.1:2000MAY01  3526  3609  forward 1  TM  N out
41  LI:1084246.1:2000MAY01   200   253  forward 2  TM  N in
41  LI:1084246.1:2000MAY01  2171  2254  forward 2  TM  N in
41  LI:1084246.1:2000MAY01  2654  2734  forward 2  TM  N in
41  LI:1084246.1:2000MAY01  3065  3142  forward 2  TM  N in
41  LI:1084246.1:2000MAY01  3284  3358  forward 2  TM  N in
41  LI:1084246.1:2000MAY01  3479  3553  forward 2  TM  N in
41  LI:1084246.1:2000MAY01   582   641  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  2127  2213  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  2457  2543  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  2580  2666  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  2751  2813  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  2826  2888  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  2961  3047  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  3249  3335  forward 3  TM  N out
41  LI:1084246.1:2000MAY01  3429  3515  forward 3  TM  N out


42  LI:1165828.1:2000MAY01    61   147  forward 1  TM  N out
42  LI:1165828.1:2000MAY01   244   312  forward 1  TM  N out
42  LI:1165828.1:2000MAY01   454   510  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  3664  3750  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  3937  4023  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  4600  4653  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  4855  4941  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  5047  5133  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  5227  5298  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  5311  5388  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  5491  5577  forward 1  TM  N out
42  LI:1165828.1:2000MAY01  5800  5871  forward 1  TM  N out
42  LI:1165828.1:2000MAY01   227   301  forward 2  TM  N in
42  LI:1165828.1:2000MAY01   713   775  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  1769  1819  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  2759  2845  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  3869  3928  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  4688  4774  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  5048  5116  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  5531  5617  forward 2  TM  N in
42  LI:1165828.1:2000MAY01  5816  5893  forward 2  TM  N in
42  LI:1165828.1:2000MAY01    39   113  forward 3  TM  N out
42  LI:1165828.1:2000MAY01   906   968  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  1602  1688  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  3471  3557  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  3558  3608  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  4203  4289  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  4749  4835  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  5625  5690  forward 3  TM  N out
42  LI:1165828.1:2000MAY01  5847  5918  forward 3  TM  N out


43  LI:007302.1:2000MAY01   346   426  forward 1  TM  N in
43  LI:007302.1:2000MAY01  2638  2721  forward 1  TM  N in
43  LI:007302.1:2000MAY01    59   145  forward 2  TM  N out
43  LI:007302.1:2000MAY01   653   718  forward 2  TM  N out
43  LI:007302.1:2000MAY01  1799  1885  forward 2  TM  N out
43  LI:007302.1:2000MAY01   321   407  forward 3  TM  N in
43  LI:007302.1:2000MAY01   480   566  forward 3  TM  N in
43  LI:007302.1:2000MAY01   645   704  forward 3  TM  N in
43  LI:007302.1:2000MAY01   807   890  forward 3  TM  N in
43  LI:007302.1:2000MAY01  1161  1223  forward 3  TM  N in
43  LI:007302.1:2000MAY01  1236  1298  forward 3  TM  N in
43  LI:007302.1:2000MAY01  1362  1448  forward 3  TM  N in
43  LI:007302.1:2000MAY01  1809  1868  forward 3  TM  N in
43  LI:007302.1:2000MAY01  1998  2084  forward 3  TM  N in
43  LI:007302.1:2000MAY01  2184  2234  forward 3  TM  N in
43  LI:007302.1:2000MAY01  2457  2540  forward 3  TM  N in
43  LI:007302.1:2000MAY01  2595  2681  forward 3  TM  N in


44  LI:236386.4:2000MAY01  3739  3792  forward 1  TM  N out
44  LI:236386.4:2000MAY01    53   118  forward 2  TM  N out
44  LI:236386.4:2000MAY01   218   304  forward 2  TM  N out
44  LI:236386.4:2000MAY01  3755  3823  forward 2  TM  N out
44  LI:236386.4:2000MAY01  2376  2435  forward 3  TM  N out
45  LI:252904.5:2000MAY01   494   550  forward 2  TM  N out
45  LI:252904.5:2000MAY01   300   374  forward 3  TM  N out
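
The rows above give, for each template, the start and stop nucleotide of a predicted transmembrane (TM) segment and the forward reading frame it lies in. The following Python sketch shows one way such a row could be checked for internal consistency. The class name `Segment` and its methods are hypothetical, and the rule that the frame equals ((start - 1) mod 3) + 1 is an assumption inferred from the tabulated values, not a statement from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One table row; field names are illustrative, not from the source."""
    seq_id: int
    template_id: str
    start: int   # first nucleotide of the segment, 1-based
    stop: int    # last nucleotide of the segment, inclusive
    frame: int   # forward reading frame, 1 to 3

    def nucleotides(self) -> int:
        """Length of the segment in nucleotides."""
        return self.stop - self.start + 1

    def residues(self) -> int:
        """Length in amino acid residues, assuming whole codons."""
        n = self.nucleotides()
        if n % 3:
            raise ValueError("segment is not a whole number of codons")
        return n // 3

    def frame_from_start(self) -> int:
        """Frame implied by the start coordinate (assumed convention)."""
        return (self.start - 1) % 3 + 1


row = Segment(36, "LI:334386.1:2000MAY01", 3302, 3355, 2)
print(row.nucleotides(), row.residues(), row.frame_from_start())
```

For the row shown, the segment spans 54 nucleotides (18 codons), and the frame implied by the start coordinate agrees with the tabulated frame.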
KIAA1431 protein [Homo sapiens]
hypothetical protein [Macaca fascicularis]
unnamed protein product [Homo sapiens]
unnamed protein product [Homo sapiens]
unnamed protein product [Homo sapiens]
KIAA0455 protein [Homo sapiens]
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CLAIMS 

What is claimed is: 

1. An isolated polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of: 

a) a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45,

b) a naturally occurring polynucleotide sequence having at least 90% sequence identity to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-45,

c) a polynucleotide sequence complementary to a), 

d) a polynucleotide sequence complementary to b), and 

e) an RNA equivalent of a) through d). 
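
As an informal illustration of the sequence relationships recited in items a) through e), the Python sketch below derives a complementary strand and an RNA equivalent from a short fragment and computes a simple percent identity. The function names are hypothetical. The identity function assumes two pre-aligned sequences of equal length with no gaps, which is a simplification: sequence identity is normally determined with an alignment algorithm, and this sketch does not reproduce any particular one.

```python
# Illustrative helpers only; names and conventions are assumptions.
COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def rna_equivalent(seq: str) -> str:
    """Replace thymine with uracil; the order of bases is unchanged."""
    return seq.lower().replace("t", "u")


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.lower(), b.lower()))
    return 100.0 * matches / len(a)


fragment = "caggagatgg"  # first ten residues of SEQ ID NO:1
print(reverse_complement(fragment))
print(rna_equivalent(fragment))
print(percent_identity(fragment, "caggagatga"))
```

A single mismatch in ten positions gives 90% identity under this simplified measure, which is the threshold named in item b).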

2. An isolated polynucleotide of claim 1, comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO:1-45.

3. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide 
of claim 1. 

4. A composition for the detection of expression of disease detection and treatment molecule 
polynucleotides comprising at least one of the polynucleotides of claim 1 and a detectable label. 

5. A method for detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 1, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction 
amplification, and 

b) detecting the presence or absence of said amplified target polynucleotide or fragment 
thereof, and, optionally, if present, the amount thereof. 

6. A method for detecting a target polynucleotide in a sample, said target polynucleotide 
comprising a sequence of a polynucleotide of claim 1, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides 
comprising a sequence complementary to said target polynucleotide in the sample, and which probe 
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex 
is formed between said probe and said target polynucleotide or fragments thereof, and 
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b) detecting the presence or absence of said hybridization complex, and, optionally, if present, 
the amount thereof. 



7. A method of claim 5, wherein the probe comprises at least 30 contiguous nucleotides. 

8. A method of claim 5, wherein the probe comprises at least 60 contiguous nucleotides.

9. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 1.

10. A cell transformed with a recombinant polynucleotide of claim 9. 

11. A transgenic organism comprising a recombinant polynucleotide of claim 9.

12. A method for producing a disease detection and treatment molecule polypeptide, the 
method comprising:

a) culturing a cell under conditions suitable for expression of the disease detection and 
treatment molecule polypeptide, wherein said cell is transformed with a recombinant polynucleotide of 
claim 9, and 

b) recovering the disease detection and treatment molecule polypeptide so expressed.

13. A purified disease detection and treatment molecule polypeptide (MDDT) encoded by at
least one of the polynucleotides of claim 2. 

14. An isolated antibody which specifically binds to a disease detection and treatment molecule 
polypeptide of claim 13. 

15. A method of identifying a test compound which specifically binds to the disease detection 
and treatment molecule polypeptide of claim 13, the method comprising the steps of: 

a) providing a test compound;

b) combining the disease detection and treatment molecule polypeptide with the test 
compound for a sufficient time and under suitable conditions for binding; and 
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c) detecting binding of the disease detection and treatment molecule polypeptide to the test 
compound, thereby identifying the test compound which specifically binds the disease detection and 
treatment molecule polypeptide. 

16. A microarray wherein at least one element of the microarray is a polynucleotide of claim 3. 

17. A method for generating a transcript image of a sample which contains polynucleotides, 
the method comprising the steps of: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 16 with the labeled polynucleotides of the
sample under conditions suitable for the formation of a hybridization complex, and

c) quantifying the expression of the polynucleotides in the sample. 

18. A method for screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of claim 1, the
method comprising: 

a) exposing a sample comprising the target polynucleotide to a compound, under conditions 
suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying amounts of 
the compound and in the absence of the compound. 

19. A method for assessing toxicity of a test compound, said method comprising: 

a) treating a biological sample containing nucleic acids with the test compound; 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at 
least 20 contiguous nucleotides of a polynucleotide of claim 1 under conditions whereby a specific 
hybridization complex is formed between said probe and a target polynucleotide in the biological 
sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 1 
or fragment thereof; 

c) quantifying the amount of hybridization complex; and 

d) comparing the amount of hybridization complex in the treated biological sample with the 
amount of hybridization complex in an untreated biological sample, wherein a difference in the amount 
of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
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20. An array comprising different nucleotide molecules affixed in distinct physical locations on 
a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or 
polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target
polynucleotide, said target polynucleotide having a sequence of claim 1.

21. An array of claim 20, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.

22. An array of claim 20, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.

23. An array of claim 20, which is a microarray. 

24. An array of claim 20, further comprising said target polynucleotide hybridized to said first 
oligonucleotide or polynucleotide.

25. An array of claim 20, wherein a linker joins at least one of said nucleotide molecules to 
said solid substrate. 

26. An array of claim 20, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules having the same sequence, and each distinct physical location on the
substrate contains nucleotide molecules having a sequence which differs from the sequence of
nucleotide molecules at another physical location on the substrate.

27. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

a) an amino acid sequence selected from the group consisting of SEQ ID NO:46-90,

b) a naturally occurring amino acid sequence having at least 90% sequence identity to an
amino acid sequence selected from the group consisting of SEQ ID NO:46-90,

c) a biologically active fragment of an amino acid sequence selected from the group
consisting of SEQ ID NO:46-90, and

d) an immunogenic fragment of an amino acid sequence selected from the group consisting
of SEQ ID NO:46-90.
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<110> INCYTE GENOMICS, INC. 
PANZER, Scott R.
SPIRO, Peter A. 
BANVILLE, Steven C.
SHAH, Purvi 
CHALUP, Michael S. 
CHANG, Simon C. 
CHEN, Alice 
D'SA, Steven A. 
AMSHEY, Stefan 
DAHL, Christopher R. 
DAM, Tam C. 
DANIELS, Susan E. 
DUFOUR, Gerard E. 
FLORES, Vincent
FONG, Willy T.
GREENAWALT, Lila B.
HILLMAN, Jennifer L. 
JONES, Anissa L. 
LIU, Tommy F. 
ROSEBERRY, Ann M. 
ROSEN, Bruce H. 
RUSSO, Frank D. 
STOCKDREHER, Theresa K. 
DAFFO, Abel
WRIGHT, Rachel J. 
YAP, Pierre E. 
YU, Jimmy Y. 
BRADLEY, Diana L. 
BRATCHER, Shawn R. 
CHEN, Wensheng 
COHEN, Howard J. 
HODGSON, David M. 
LINCOLN, Stephen E. 

<120> MOLECULES FOR DISEASE DETECTION AND TREATMENT 

<130> PT-1133 PCT 

<140> To Be Assigned 
<141> Herewith 

<150> 60/185,213; 60/205,285; 60/205,232; 60/205,323; 60/205,287; 60/205,324; 60/205,286
<151> 2000-02-24; 2000-05-17; 2000-05-16; 2000-05-17; 2000-05-17; 2000-05-17; 2000-05-17

<160> 90 

<170> PERL Program 

<210> 1 

<211> 1378 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:977683.1:2000FEB18
<220> 

<221> unsure 
<222> 1355 

<223> a, t, c, g, or other 
<400> 1 

caggagatgg cggcggcggc ggctagggat cagacatggc ggcggatctg aacctggagt 60 
ggatctccct gccccggtcc tggacttacg ggatcaccag gggcggccga gtcttcttca 120 
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tcaacgagga ggccaagagc accacctggc tgcaccccgt caccggcgag gcggtggtca 180 
ccggacaccg gcggcagagc acagatttgc ctactggctg ggaagaagca tatacttttg 240 
aaggtgcaag atactatata aaccataatg aaaggaaagt gacctgcaaa catccagtca 300 
caggacaacc atcacaggac aattgtattt ttgtagtgaa tgaacagact gttgcaacca 360 
tgacatctga agaaaagaag gaacggccaa taagtatgat aaatgaagct tctaactata 420 
acgtgacttc agattatgca gtgcatccaa tgagccctgt aggcagaact tcacgagctt 480 
caaaaaaagt tcataatttt ggaaagaggt caaattcaat taaaaggaat cctaatgcac 540 
cggttgtcag acgaggttgg ctttataaac aggacagtac tggcatgaaa ttgtggaaga 600 
aacgctggtt tgtgctttct gacctttgcc tcttttatta tagagatgag aaagaagagg 660 
gtatcctggg aagcatactg ttacctagtt ttcagataag ctttgcttac cctctgaaga 720 
tcacattaat cgcaaatatg cttttaaggc agcccatcca aacatgcgga cctattattt 780 
ctgcactgat acaggaaagg aaatggagtt gtggatgaaa gccatgttag atgctgccct 840 
agtacagaca gaacctgtga aaagagtgga caagattaca tctgaaaatg caccaactaa 900 
agaaaccaat aacattccca accatagagt gctaattaaa ccagagatcc aaaacaatca 960 
aaaaaacaag gaaatgagca aaattgaaga aaaaaaggca ttagaagctg aaaaatatgg 1020 
atttcagaag gatggtcaag atagaccctt aacaaaaatt aatagtgtaa agctgaattc 1080 
tctgccatct gaatatgaga gtgggtcagc atgccctgct cagactgtgc actacagacc 1140 
aatcaacttg agcagttcag agaacaaaat agtcaatgtt agcctggcag atcttagagg 1200 
tggaaatcgc cccaatacag ggcccttata cacagaggcc gatcgagtca tacagagaac 1260 
aaattcaatg cagcagttgg aacagtggat taaaatccag aaggggaggg gtcatgaaga 1320 
agaaaccagg ggagtaattt cttaccaaac attancaaga aatatgccaa gtcacaga 1378 
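
Each residue line in the listing ends with a running count of residues, and the `<222>` entry gives the position of any uncertain base. The Python sketch below shows one way to cross-check those values for the last two lines of SEQ ID NO:1. The function names and the `start` parameter are hypothetical and are introduced only for this illustration.

```python
def parse_residue_line(line: str) -> tuple[str, int]:
    """Split one listing line into (residues, cumulative_count)."""
    *blocks, counter = line.split()
    return "".join(blocks), int(counter)


def check_lines(lines: list[str], start: int = 0) -> str:
    """Join residue lines, verifying each trailing counter against a running total.

    `start` is the number of residues that precede the first line supplied.
    """
    total, parts = start, []
    for line in lines:
        residues, counter = parse_residue_line(line)
        total += len(residues)
        if total != counter:
            raise ValueError(f"counter {counter} does not match running total {total}")
        parts.append(residues)
    return "".join(parts)


tail = check_lines(
    [
        "aaattcaatg cagcagttgg aacagtggat taaaatccag aaggggaggg gtcatgaaga 1320",
        "agaaaccagg ggagtaattt cttaccaaac attancaaga aatatgccaa gtcacaga 1378",
    ],
    start=1260,
)
unsure_position = 1355  # from the <222> entry for SEQ ID NO:1
print(tail[unsure_position - 1260 - 1])
```

The counters agree with the declared length of 1378, and the residue at position 1355 is the ambiguity code `n`, matching the `<221> unsure` annotation.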

<210> 2 

<211> 662 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:893050.1:2000FEB18
<400> 2 

gggtcttaga gtttaccttc tacttccttt agagtgtctt cgcttttctc agggcacttg 60 
gaggtcctaa aactgctggt ggcacgggga gcagacctcg gctgcaaggc ccgcaagggc 120 
tatgggctgc tccatacagc tgctgccagt ggccagattg aagtggtgaa gtacctgctt 180 
cggatgggag cggagatcga tgaacccaat gcttttggaa acacagcttt gcacatcgcc 240 
tgctacctgg gccaggatgc tgtggctatt gagctggtga atgccggagc caatgtcaac 300
cagccgaatg acaagggctt cacgccactg catgtggctg cagtctcgac caatggcgct 360 
ctctgcttgg agctactggt taataatggg gctgacgtca actaccagag caaagaaggg 420 
aaaagtcctc tgcacatggc tgcaatccat ggccgtttca cacgctccca gatcctcatc 480 
cagaatggca gcgagattga ttgtgccgac aaatttggga acacgccact gcatgtggct 540 
gctcgatatg gacacgagct gctcatcagc accctcatga ccaatggcgc agataccggc 600 
cggcgtggca tccatgacat gttccccctg cacttagctg ttctctttgg attctctgac 660 
tg 662 

<210> 3 

<211> 2764 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:980153.1:2000FEB18
<220> 

<221> unsure 
<222> 2663 

<223> a, t, c, g, or other 
<400> 3 

ccccgttccc gattcatgta gtagcggctg tattgcagcc gcctgccgaa ctgacccggg 60 
tctggggact ggcccctctg gcgccgttcg gtttctctta ttgccttcac tgaggatgag 120 
tccctttgtg gctctatgtg gaccctgcgg aatccaccgg cgcagtttca tctagcgact 180 
ggtcaccctt gcaattatgg atatttaaaa gggtcagaca gtgtggaggg ggagttcccc 240 
tcctcactcc cccttggtgc ttgactccag gaataattta taaactgtgg aattttttta 300 
aatgaagaac ttgtatttga tatgaacttt atagagctat ttataatttt tttgatttaa 360 
gtgccaaaaa aattgtataa agatatatag ttttatacta ttgtcaggag gatttaaatt 420 
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atcctaaaaa ggtaatttat tctctgtaac ttcctcaata gcacctttgt gtcctggctt 480 
tttcattttt taaaattagt tttcacgatt ctgaagtaag tggtataaaa acagttagga 540 
tgagttcacc catgcctgac tgcacatcaa agtgtcgatc cctgaagcat gctttggaag 600 
tcccttctgt ggtaacaaag gggagcgaaa acccgattaa ggcccttctc tccacgtcat 660 
tgttacaaag ctgccactat caaggatgtt tttggcagga atgccctcca cccctgtttc 720 
ctcctcgtgg agaagaaagg agtgttagat tggcttattc agaaaggagt ggatctgttg 780 
gtgaaagaca aagagtctgg atggacagca ttgcacagaa gcatttttta tggacatatt 840 
gattgtgttt ggtctctatt gaagcatggt gttagtctgt atattcaaga taaagaaggc 900 
ttgtcagctt tggatcttgt aatgaaggat agaccaactc atgtagtatt caagaatact 960 
gatcctacag atgtttatac ttggggcgat aatacaaatt ttaccctggg tcatggaagc 1020 
cagaatagca aacatcatcc agagttggtg gatctgttct ccaggagtgg gatttatatc 1080 
aagcaggtgg tgctttgtaa atttcactcc gtgtttctgt ctcagaaagg gcaggtttat 1140 
acctgtggtc atggtcctgg agggcgatta ggacatggag atgaacagac atgcttggtc 1200 
cctcggcttg tggaaggact gaatggtcat aattgttccc aagtggcagc tgctaaggat 1260 
catactgttg tattaactga agatggatgt gtttatacat ttggtctaaa catttttcat 1320 
caattaggaa ttattccacc gccttccagt tgtaatgtac ccagacagat acaggcaaaa 1380 
tatctgaaag gaaggacaat cattggcgtt gcagcaggca ggtttcatac agtcctatgg 1440 
actagagaag ctgtttacac tatgggacta aatggtggac aactgggttg tttgctagat 1500 
cccaatggag aaaagtgtgt aactgctcct cgtcaggtct ctgcccttca ccataaagac 1560 
attgctctgt ctttggttgc tgcaagtgat ggagctacag tctgtgttac cacaagggga 1620 
gatatttact tacttgcaga ctatcagtgc aagaagatgg cttctaaaca gttgaacttg 1680 
aaaaaagttc ttgtgtctgg gggtcatatg gaatacaagg ttgatcctga acatttgaaa 1740 
gaaaatgggg gtcaaaaaat ttgcattctt gcaatggatg gagctggaag ggtgttttgc 1800 
tggagatcag tcaacagttc tctgaagcag tgtcgatggg cctatccacg tcaggtcttc 1860 
atttctgata ttgctttaaa tagaaatgaa attctatttg ttacgcaaga tggagaagga 1920 
tttagaggga gatggtttga agagaaaaga aagagttctg aaaagaaaga gattttatca 1980 
aaccttcaca attcctcatc agatgtgtct tatgtctctg atataaatag tgtgtatgaa 2040 
agaattcgac ttgagaaact tacctttgca catagagctg ttagtgtcag cacagatcca 2100 
agtggatgca actttgcaat cctgcagtca gatcctaaaa caagccttta tgaaattcca 2160 
gctgtgtcct catcatcctt ttttgaagag tttggcaaac tgttgaggga agcagatgaa 2220 
atggacagca ttcatgatgt gacatttcaa gttggcaata gactcttccc tgcacataaa 2280 
tatattttgg cagtgcattc tgattttttt cagaaattgt ttctttcaga tggtaatact 2340 
tcagaattta cagatattta ccagaaagat gaagattctg cagggtgcca tctctttgtg 2400 
gtagagaagg ttcatcctga catgtttgaa taccttttac aatttatata cacagatact 2460 
tgtgactttt taactcatgg cttcaaacca agaatacact taaacaaaaa cccagaagaa 2520 
tatcagggaa ctctgaattc tcatttgaat aaagtgaatt tccatgaaga tgataaccag 2580 
aagtctgcat ttgaagttta caaaagtaat caagctcaaa cagttagtga gaggcagaag 2640
agcaaaccta aatcttgtaa aanaggaaaa aatattaggg aagatgatcc tgtaagaatg 2700 
ttgcaaactg ttgcaaagaa attcgacttc agtaatttga gtagtaggtt agatggagtc 2760 
agat 2764

<210> 4 
<211> 388 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:350398.1:2000FEB18
<220> 

<221> unsure 
<222> 125 

<223> a, t, c, g, or other 
<400> 4 

cccttctacg tccgctgcat caagcccaat gaggacaagg tagctgggaa gctggatgag 60 
aaccactgtc gccaccaggt cgcatacctg gggctgctgg agaatgtgag ggtccgcagg 120 
gctgnttcgc ttcccgccag ccctactctc gattcctgct caggtactgg cacctgacac 180 
ccatcactcc atgggccata gtccctgtgt ggagtccaag gggtaggagc agagggtccc 240 
caaacagcac gtcgcaaaca tcgatacaag caggaaccag cacgctgctg gcctcaagac 300 
accaaaatat ctgggaagac atgtgtgtga gcacatgcat gtggggacat acaggtggga 360 
acatgggtat gagggctgtg tgaggaca 388 

<210> 5 
<211> 2364 
<212> DNA 
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<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:475551.1:2000FEB18 
<220> 

<221> unsure 
<222> 424-425 

<223> a, t, c, g, or other 
<400> 5 

gtctgcaggg ccagagcggg gcagacatgg acaagcgggt gaagaagctt cccctcatgg 60 
ctctgtccac cacgatggct gagagcttca aggagctgga ccctgattcc agcatgggga 120 
aggccttgga gatgagctgt gccatccaga atcagctggc ccgcatcctg gccgagtttg 180 
agatgaccct ggagagggac gtcctgcagc cactcagcag gctgagtgag gaggagctgc 240 
cagccatcct caaacacaag aaaagcctcc agaagctcgt gtccgactgg aacacactca 300 
agaacaggct cagtcaggca accaagaatt caggcagcag tcaaggccta ggaggcagcc 360
cgggtagtca cagccatacg accatggcca acaaggtgga gacgctgttc tactgcagca 420
ggtnntcacc caggaaagtg gagcaatgca gggacgagta cttggctgac ctgtaccact 480 
ttgttaccaa ggaggactcc tatgccaact acttcattcg tctcctggag attcaggccg 540 
attaccatcg caggtcactg agctcgctgg acacagccct ggctgagctg agggagaacc 600 
acggccaagc agaccactcc ccttcgatga cagccaccca cttccccagg gtgtatgggg 660 
tgtcgctggc aacccacctg caagagctgg gccgggagat tgccctgccc atcgaggcct 720 
gcgtcatgat gctgctttct gagggcatga aggaagaggg tctcttccgt ctggctgctg 780 
gggcctcggt gctgaagcgt ctcaagcaga caatggcctc ggacccccac agcctggagg 840 
agttctgctc cgacccgcac gctgtggcag gtgccctcaa gtcctatctg cgggagctgc 900 
cagagcctct gatgaccttc gacctctatg atgactggat gagggcagcc agcctgaagg 960 
agccaggggc ccggctgcag gccctccaag aggtgtgcag ccgcctaccc cccgagaacc 1020 
tcagcaacct caggtacctg atgaagttcc tggcacggct ggccgaggag caggaggtga 1080 
acaagatgac acccagcaac atcgccatag tcctgggacc caacttgctg tggccacctg 1140 
agaaagaagg ggaccaggcc cagctggatg cagcctccgt gtcttccatc caggtggtgg 1200 
gcgtcgtcga ggcgctgatc cagagcgcag acaccctctt ccctggagac atcaacttca 1260 
acgtgtcagg cctcttctca gctgttaccc tccaggacac agtcagtgac aggctggcct 1320 
ctgaggaact tccgtccact gccgtgccca ccccagccac caccccggct ccggctccgg 1380 
ctccagctcc agctccggcc ccagccttgg cttcagcagc taccaaggaa aggacagagt 1440 
ctgaggtgcc tcccagacca gcctccccca aggtcaccag gagtcccccg gagacagctg 1500 
ccccagtgga ggacatggct cggaggacca agcgcccggc gccagcccgg cccaccatgc 1560 
cgccccccca ggtctccggc tcccgctcct cccctccagc cccgcccttg ccccctggct 1620 
ctggcagccc tgggaccccc caagccctgc cccgacgtct ggttggcagc agcctccgag 1680 
cccccacagt gccacccccg ttacccccca caccccctca gcctgcccgg cgccaaagcc 1740
ggcgttcacc agcctccccc agcccggcct ccccaggtcc agcctccccc agcccagtct 1800 
ctttgagtaa ccctgcacag gtggacctgg gggctgccac agcagaggga ggagcccctg 1860 
aggctatcag tggggtcccc actcccccag ctatcccccc tcagccccgc cccaggagcc 1920 
ttgcctcaga gaccaactga gtggctggtt tctccctaag cagccctcag caccccctcc 1980 
ctccccacct ggccctccca ggacagctct cgccccccac aaaggggcat gggcctccag 2040 
cctttgccca caagtgcctc agtgcccact gggtcggccc ccatggccag gagggctcag 2100 
gacaatcctc tatttcctga ccttttcctc gtccaccctg ggcttgggga cccccccacc 2160 
ggactctcca ctctccggca ggtcctaggg gagccaccgg aaggaaggag aggtttgcct 2220 
gctcctacgg gactgattct tctcttgccg acatgttttt tgtaaggctg gtaaataaat 2280 
tattttggac aaaactggag cagctgccca aatgatagtt ttattttctg tccttgaaat 2340 
aaagaagcca attttataaa gggg 2364

<210> 6 

<211> 1801 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:481407.2:2000FEB18
<400> 6 

cgccttcggg gcccggatct caaacagtcg ggaagaagca ccgtggctgc tattatctgc 60 
tctccgcgcc tgacccctcc caggactcgt gatgccaagg ccgctgcgag cggctacgaa 120 
gagtcggggt tgagccccag ctgagccgag ggctcgcact cttctggtct cccaggccca 180 
acccacctga agaaatgagt ggtggattgg ctccaagtaa gagcacagtg tatgtatcca 240 
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acttgccttt ttccctgaca aacaatgact tgtaccggat attttccaag tatggcaaag 300 
ttgtaaaggt taccatcatg aaagataaag ataccaggaa gagtaaaggg gttgcattta 360
ttttattttt ggataaagac tctgcacaaa actgtaccag ggcaataaac aacaaacagt 420 
tatttggtag agtgataaaa gcaagcattg ctattgacaa tggaagagca gctgagttca 480 
tccgaaggcg aaactacttt gataaatcta agtgttatga atgtggggaa agtggacact 540 
taagttatgc ctgtccgaaa aatatgctcg gagaacgtga gcctccaaag aagaaagaaa 600 
aaaaagaaaa aaaagaaagc tcctgaacca gaagaagaaa ttgaggaagt agaagaaagt 660 
gaagatgaag gggaggatcc tgctcttgac agcctcagtc aggccatagc attccagcaa 720 
gccaaaattg aagaagaaca aaaaaaatgg aaacccagtt caggagtccc ctcaacatca 780 
gatgattcaa gacgcccaag gataaagaaa agcacatatt tcagtgatga ggaagaactt 840 
agtgattaaa atcttgcccc agcacagtaa taaaaatcaa gatttgttag taacaatctt 900 
gaagagctaa ttttaataaa aataagaaaa attaatacta tcatgttaat actattattg 960 
tcatcccaag aaaaaagata ttttaaaaat ttatttgaaa agttcattat aagggcttta 1020 
ttcatgcctg atttgtttac atgaggactt ctgaaattaa tccttaaaac aaacttcctg 1080 
aagaccgaaa agttgaatga tttattgtta cttatattaa taaacttttc aagagaattt 1140 
tgtctttaaa tatgggtgtt ttgtcatcat atttcttgta gctttatccc aatctggata 1200 
aattgtaaat acctataaaa taaattataa atacctataa aatataaagt aacatagctc 1260 
taaaaggctt aaaatcaaac acaggtgtta tttgtctgcc ctacccatag caccaaattc 1320 
cattccctag aagaaactac ttacatgtgc acacatgtag ctaaataaag tacatatatt 1380 
tcagtcactt aataaaagaa taaaggagaa ttattcaatc tcttatactt ctcctacctt 1440 
cctcaatatt cccaatgtgg ctgtattaaa aatttggggg aatccatata tatctttttt 1500 
aataaccaag taaatacttt tcatttctga gccaaggaat atgctatgat tacgtttttt 1560 
cctagagtta ataattgtct attttttttc catgtattgt ctttgtattt atgactaaat 1620 
cttcccattc tgtctgcagg tgggtatatg gtaatgggat tagagagcct ttaattttct 1680 
gctttgtata tttctatatt gtttaacttt gtaagaatgc ccattacttt tttaactagt 1740 
aaaagcaata gaaataagtt aatactatca tagtaatatt attattgtca tcccgaggcg 1800 
g 1801 

<210> 7 

<211> 730 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:443580.1:2000FEB01
<220> 

<221> unsure 
<222> 44 

<223> a, t, c, g, or other 
<400> 7 

ggaggtgaga tattttggtc cccaggagaa ctggctcagg tctncaagtt cccatccggg 60 
atgactggaa agggttagga aacctctctg aggtctggtc agattccaac cctggacagc 120 
agtgaacaca acctttcccc tgagccactg gaattggaca gaatgcccca ttctcctctg 180 
atctccattc ctcatgtgtg gtgtcaccca gaagaggagg aaagaatgca tgatgaactt 240 
ctacaagcag tatccaaggg gccggtgatg ttcagggatg tttccataga cttctctcaa 300 
gaggaatggg aatgcctgga cgctgatcag atgaatttat acaaagaagt gatgttggag 360
aatttcagca acctggtttc agtgggactt tccaattcta agccagctgt gatctcctta 420 
ttggaacaag gaaaagagcc ctggatggtt gatagagagc tgactagagg cctgtgttca 480 
gatctggaat caatgtgtga gaccaaaata ttatctctaa agaagagaca tttcagtcaa 540 
gtaataatta cccgtgaaga catgtctact tttattcagc ccacatttct tattccacct 600 
caaaaaacta tgagtgaaga gaaaccatgg gaatgtaaga tatgtggaaa gacctttaat 660 
caaaactcac aatttatcca acatcagaga attcattttg gtgaaaaaca ctatgaatct 720 
aaggaaaaaa 730

<210> 8

<211> 457 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:803015.1:2000FEB01
<400> 8 
gcgcgggctg cggctgggat ccggtctttc cagccccgag agggacctgg ttcctctgcc 60
caggcttctg tcactctgtc acctacgcta tgccctgctg tagtcacagg aggtgtagag 120
aggaccccgg gacatctgaa agccaggaaa tggacccagt ggcctttgat gatgttgctg 180
tgaacttcac ccaggaggag tgggctttgc tggatatttc ccagaggaaa ctctacaagg 240
aagtgatgct ggaaactttc aggaacctga cctctgtagg aaaaagttgg aaagaccaga 300
acattgaata tgagtaccaa aaccccagga gaaacttcag gagtctcata gaaaagaaag 360
tcaatgaaat taaagatgac agtcattgtg gagaaacttt tacccaggtt ccagatgaca 420
ggctgaactt ccaggagaag aaagcttctc ctgaaat 457



<210> 9 

<211> 582

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: LG:027410.3:2000MAY19



<400> 9 

ggcacaccga ggctcggccg ccccgccgcg agtcctggat cagtgacatt cgagcaggaa 60
ccgccccttc atgcaggaac cacatcaaat caagctgcag cttgatcgcc ttcaactccg 120
accgtcctgg tgtactgggc attgtgcctc tgcaaggcca aggagaggac aagcgacgcg 180
tggcccacct gggctgccat tcagacctag tcaccgactt ggacttctcg ccctttgatg 240
acttcctcct ggccacaggc tcggctgaca ggacggtaaa actctggcga ctgccagggc 300
ctggccaggc cctgccctca gcacccgggg tggtgctggg ccccgaggac ctcccagtgg 360
aggtactgca gttccacccc acctctgacg gcattctgag ctggcagccc atggggacct 420
ggtgcagagc gccgtctgga gccgagatgg agccctggtg ggcacggcgt gcaaggacaa 480
gcagctgcgg atctttgacc ccagaacaaa gccgcgggcc tctcagagca cgcaggccca 540
tgagaacagc agggatagcc ggctggcatg gatgggcacc tg 582



<210> 10 

<211> 848 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:171377.1:2000MAY19



<400> 10 

agcggccgca gcctctgaga gcacgaacag cagcgccccc gcgtcccagc cagccagcca 60
gccagactgg actccggccc accgacggcc gctcgcgctc cggccccgct cgcctgctct 120
gccccggacc tgcagctccc cgctcccccg ccgtgtccgc cgcctcccgg ccagagagcc 180
aagcccccac gccgcgccca gccgtcgccg cgccgagcat gtcctcgacc gagaggcgcc 240
cggcgggacg gcgggacagg tcgccgcgcc agcaggtgga ccgcctactc gtggggctgc 300
gctggcggcg gctggaggag ccgctgggct tcatcaaagt tctccagtgg ctctttgcta 360
ttttcgcctt cgggtcctgt ggctcctaca gcggggagac aggagcaatg gttcgctgca 420
acaacgaagc caaggacgtg agctccatca tcgttgcatt tggctatccc tgcaggttgc 480
accggatcca atatgagatg cccctctgcg atgaagagtc cagctccaag accatgcacc 540
tcatggggga cttctctgca cccgccgagt tcttcgtgac ccttggcatc ttttccttct 600
tctataccat ggctgcccta gttatctacc tgcgcttcca caacctctac acagagaaca 660
aacgcttccc gctggtggac ttctgtgtga ctgtctcctt caccttcttc tggctggtag 720
ctgcagctgc ctggggcaag ggcctgaccg atgtcaaggg ggccacacga ccatccagct 780
tgacagcagc catgtcagtg tgccatggag aggaagcagt gtgcagtgcc ggggccacgc 840
cctctatg 848



<210> 11 

<211> 636

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: LG:352559.1:2000MAY19
<400> 11 

tgtagtttcc tcaactactg cctcagctct acaatcccag agtaaagctc ttctccaaat 60 
gaagagccag gaagaggtag aggtggcagg aattaaactt tgtaaagcca tgtccctggg 120
ttcactgact ttcacagatg tggccataga cttttcccaa gatgaatggg agtggctgaa 180 
tcttgctcag agaagtttgt acaagaaggt gatgttagaa aactacagga acctagtttc 240 
agtgggtctt tgcatttcta aaccagatgt gatctcctta ctggagcaag agaaagaccc 300 
ttgggtgata aaaggaggga tgaacagagg cctgtgccca gacttggagt gtgtgtgggt 360 
gaccaaatca ttatctttaa accaggatat ttatgaagaa aaattacccc cggcaatcat 420 
aatggaaaga cttaaaagct atgaccttga atgttcaaca ttagggaaaa actggaaatg 480 
tgaagacttg tttgagaggg agcttgtaaa ccagaagaca cattttaggc aagagaccat 540 
cactcatata gatactctta ttgaaaaaag agatcactct aacaaatctg ggacagtttt 600 
tcatctgaat acattatctt atataaaaca gatttt 636

<210> 12 

<211> 2110 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: LG:247384.1:2000MAY19
<400> 12 

ccaggagaag gaagccaaca ggatccgacc cggtgttttg tgacaaaggc aagaccccca 60 
ggtctactta gagcaaagtt agtagaggag gcagctaggc gtggctctca ttccttccca 120
cagaatggat tataagtcga gcctgatcca ggatgggaat cccatggaga acttggagaa 180 
gcagctgatc tgccctatct gcctggagat gtttaccaag ccagtggtca tcttgccgtg 240 
ccagcacaac ctgtgccgga agtgtgccaa tgatattttc caggcctcta acccgtattt 300
gcccacaaga ggaggtacca ccatggcatc agggggccga ttccgctgcc catcctgtag 360
acatgaagtg gttttggata gacatggggt atatggactt cagaggaacc tgctggtgga 420 
gaacatcatc gacatctaca aacaggagtg ctccagtcgg ccgctgcaga agggcagtca 480 
ccccatgtgc aaggagcacg aagatgagaa aatcaacatc tactgtctca cgtgtgaggt 540 
gcccacctgc tccatgtgca aggtgtttgg gatccacaag gcctgcgagg tggccccatt 600 
gcagagtgtc ttccagggac aaaagactga actgaataac tgtatctcca tgctggtggc 660 
ggggaatgac cgtgtgcaga ccatcatcac tcagctggag gattcccgtc gagtgaccaa 720 
ggagaacagt caccaggtaa aggaagagct gagccagaag tttgacacgt tgtatgccat 780 
cctggatgag aagaaaagtg agttgctgca gcggatcacg caggagcagg agaaaaagct 840 
tagcttcatc gaggccctca tccagcagta ccaggagcag ctggacaagt ccacaaagct 900 
ggtggaaact gccatccagt ccctggacga gcctggggga gccaccttcc tcttgactgc 960 
caagcaactc atcaaaagca ttgtggaagc ttccaagggc tgccagctgg ggaagacaga 1020 
gcagggcttt gagaacatgg acttctttac tttggattta gagcacatag cagacgccct 1080 
gagagccatt gactttggga cagatgagga agaggaagaa ttcattgaag aagaagatca 1140 
ggaagaggaa gagtccacag aagggaagga agaaggacac cagtaaggag ctggatgaat 1200 
gagaggcccc cagatgcaga gagactggag agggtgggga ggggcccagc ggccttggtg 1260 
acaggcccag ggtgggaggg gtcggggccc ctggaggggc aatggggagg tgatgtcttc 1320 
tctctgctca gagagcaggg actagggtag gaccctcacc gctgcgtcca gcagacactg 1380 
aaccagaatt ggaaacgtgc ttgaaacaat cacacaggac acttttctac attggtgcaa 1440 
aatggaatat tttgtacatt tttaaaatgt gatttttgta tatacttgta tatgtatgcc 1500 
aatttggtgc tttttgtaaa ggaacttttg tataataatg cctggtcatt gggtgacctg 1560 
cgattgtcag aaagagggga aggaagccag gttgatacag ctgcccactt cctttcctga 1620 
gcaggaggat ggggtagcac tcacagggac gatgtgctgt atttcagtgt ctatcccaga 1680 
catacggggt ggtaactgag tttgtgttat atgttgtttt aataaatgca caatgctctc 1740 
ttcctgttct tcaaaggagc cggggtttca ttcagccttt ttttcctgga gatgagggtt 1800 
gagtgtgaat gaacaggacc cctggtagga ggcaatggca gggctaggct taggtcccag 1860 
taaaggagtt ctcgacacca ccatttccca atgtggactc catggaaagc cagccctgag 1920 
ctggtccttc aagaacaggt tcaatgtgtt gttgctccgg ttctccagaa aacagagcct 1980
gaggcaaaat ttaaatgctt tagttgcagg ttatagggac ttccccgtgc tcactgaagg 2040 
ctcactgaag gctcactgaa aatcatcaag aagaggcaga ttaaggctgg gcaggtgcag 2100 
tggttcatgc 2110 

<210> 13 

<211> 2375 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:403872.1:2000MAY19
<220>

<221> unsure 
<222> 1233 

<223> a, t, c, g, or other



<400> 13 

gcagcgccag gaggaggcag cggaggaagc agagcgcggg atgggcgccc agcggcatct 60 
gtgatcccgc gcacctccgc cccacgggcg cgcgcacaaa cacggacaca cacatacaca 120 
cactcgcgca cacactcgca caaacacaca ctcgtacacg cccgcgccgc tcgctcgccg 180 
gcttgctctc ccacgcaagc ggaatgcagc agcgcctgga gagcgtgtct cggaccgccg 240 
cctgaatgta cctcgctccc gggagccgga cggcccagta gggcgcactg gaggacgctc 300 
cgctgcggga gcctggacag tttttgacgg tgcagtcttg ctatatggtg tgagaaatgg 360 
ctgtaggaaa caacactcaa cgaagttatt ccatcatccc gtgttttata tttgttgagc 420 
ttgtcatcat ggctgggaca gtgctgcttg cctactactt cgaatgcact gacacttttc 480 
aggtgcatat ccaaggattc ttctgtcagg acggagactt aatgaagcct tacccaggga 540 
cagaggaaga aagcttcatc acccctctgg tgctctattg tgtgctggct gccaccccaa 600 
ctgctattat ttttattggt gagatatcca tgtatttcat aaaatcaaca agagaatccc 660
tgattgctca ggagaaaaca attctgaccg gagaatgctg ttacctgaac cccttacttc 720 
gaaggatcat aagattcaca ggggtgtttg catttggact ttttgctact gacatttttg 780 
taaacgccgg acaagtggtc actgggcact taacgccata cttcctgact gtgtgcaagc 840 
caaactacac cagtgcagac tgccaagcgc accaccagtt tataaacaat gggaacattt 900 
gtactgggga cctggaagtg atagaaaagg ctcggagatc ctttccctcc aaacacgctg 960 
ctctgagcat ttactccgcc ttatatgcca cgatgtatat tacaagcaca atcaagacga 1020 
agagcagtcg actggccaag ccggtgctgt gcctcggaac tctctgcaca gccttcctga 1080 
caggcctcaa ccgggtctct gagtatcgga accactgctc ggacgtgatt gctggtttca 1140
tcctgggcac tgcagtggcc ctgtttctgg gaatgtgtgt ggttcataac tttaaaggaa 1200 
cgcaaggatc tccttccaaa cccaagcctg agnatccccg tggagtaccc ctaatggctt 1260 
tcccaaggat agaaagccct ctggaaacct taagtgcaca gaatcactct gcgtccatga 1320 
ccgaagttac ctgagacgac tgatgtgtca caagctgttt tttaaaatca tcttccaatt 1380 
ctatacttca aaacacacag ttgctcaatg tcaaactgtg atgacaaata ttacgtttat 1440 
ctagttagaa gctaatgttt tgtacatttt ttgtatgagg aagtgatgta gcttgccctg 1500 
attttttttt tttttttttg gtcagcttta atatatttat gccagaattt taaaaccaac 1560 
aaaattttct tgttcaagcg tgcattgaag aaccacattt attcaatggt tgacgttgtt 1620 
ttgtgatatt tgtacacaaa ttttcttttc tcagttttat aaacacagaa gtaaatataa 1680 
caattcactt taaactttta ttaccacagt tgctgcctcc tccagaattt ttgaatttta 1740 
ataaaaggca aacttttgag ctgcaggaag gacaatgttg gttaataata aatctcaaag 1800 
tcaattgtag aaaaaaaatt gtcttcaaaa agaatgttgc actctgatct cttaacaaat 1860 
tgttacgttc aaagtttaaa gtgatatatt aacaaagtca cctagttata caaacaattg 1920 
tcagagaatt ctggatttgg agggtattgg ggttatatga ttctttctta gataatggcc 1980 
tctactaaat aactcaagat ctttctggaa tgtcttctgg caggcaggtg ccactgtcag 2040 
cttttctcca aaaagcagcc aacatcagcc tcccctgtca actcaacagt tttgtatctc 2100 
atattatatg gactttatat gaaaatgaat attttacagt ttgcacagta ttattttaca 2160 
gaaaaggaat cagagaatct acaacatagg gccccagaac aacagtttca ctttgtggct 2220 
tttaattatt ctagaatttt aactgcatct catttttcta gcatggtgag aactaatatg 2280 
taactccttt gattgaagga gctcttttgt ccgtacctat cagaatgttt tcttgacact 2340 
tccatgttgg ctcttctcag ctttttttgt acata 2375

<210> 14 

<211> 537 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:1135213.1:2000MAY19
<400> 14 

ggacccgcga gcggagcggc gcgtgggtcg gttgcggtcg gccccggcag gatgggaagg 60 
ccattgtgac tatgtggtga ttacagttgt cttactactg agtttcctac tgaaatcatg 120 
gaggagaaac agcagattat attggctaat caagatggtg gaacagtggc aggagcagca 180 
cctaccttct ttgtcatctt aaagcagcca ggaaatggca aaactgatca aggaattttg 240 
gttactaatc aggatgcctg tgctttggct agtagtgtgt catcaccagt aaaatctaaa 300
gggaagattt gccttccagc tgattgtact gtgggtggaa tcactgttac cctcgataac 360
aatagtatgt ggaatgagtt ctatcatcga agcacagaga tgattctgac caagcaagga 420 
agacgcatgt ttccttactg tcgttattgg ataacaggtt tagattcaaa tttgaagtat 480 
attcttgtca tggatatatc tcctgtggat aaccatcgtt ataagtggaa tggtcgt 537 

<210> 15

<211> 1433 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:474284.2:2000MAY19 
<400> 15 

ggcctgcccc ggccccctgc ccgcggcgcc atggcggaga attggaagaa ctgcttcgag 60 
gaggagctca tctgccctat ctgcctgcac gttttcgtgg agccagtgca gctgccgtgc 120 
aaacacaact tctgccgggg ctgcatcggc gaggcgtggg ccaaggacag cggcctcgta 180 
cgctgcccag agtgcaacca ggcctacaac cagaagccgg gcctggagaa gaacctgaag 240 
ctcaccaaca tcgtggagaa gttcaatgcc ctgcacgtgg agaagccgcc ggcggcgctg 300 
cactgcgtgt tctgccgccg cggccccccc gctgcccgcg cagaaggtct gcctgcgctg 360 
cgaggcgccc tgctgccagt cccacgtgca gacgcacctg cagcagccct ccaccgcccg 420 
cgggcacctc ctggtggagg cggacgacgt gcgggcctgg agctgcccgc agcacaacgc 480 
ctaccgcctc taccactgcg aggccgagca ggtggccgtg tgccagtact gctgctacta 540 
cagcggcgcg catcagggac actcggtgtg cgacgtggag atccgaagga atgaaatccg 600 
gaagatgctc atgaagcagc aggaccggct ggaggagcga gagcaggaca ttgaggacca 660 
gctgtacaaa ctcgagtcag acaagcgcct ggtggaggag aaagtgaacc aactgaagga 720 
ggaagttcgg ctgcagtacg agaagctgca ccagctgctg gacgaggacc tgcggcagac 780 
agtggaggtc ctagacaagg cccaggccaa gttctgcagc gagaacgcag cgcaggcgct 840 
gcacctcggg gagcgcatgc aggaggccaa gaagctgctg ggctccctgc agctgctctt 900 
tgataagacg gaggatgtca gcttcatgaa gaacaccaag tctgtgaaaa tcctgatgga 960 
cagcagatgc cccgtccact ggccccagga cccagacctg cacgagcagc agcctttccc 1020 
ccactaagat cggccacctg aactccaagc tcttcctgaa cgaagtggcc aagaaggaga 1080 
agcagctgcg gaaaatgcta gaaggcccct tcagcacgcc ggtgcccttc ctgcagagtg 1140 
tccccctgta cccttgcggc gtgagcagct ctggggcgga aaagcgcaag cactcaacgg 1200 
ccttcccaga ggccagtttc ctagagacgt cgtcgggccc tgtgggcggc cagtacgggg 1260 
cggcgggcac agccagcggt gagggccagt ctgggcagcc cctggggccc tgcagctcca 1320 
cgcagcactt ggtggccctg ccgggcggcg cccaaccagt gcactcaagc cccgtgttcc 1380
ccccatcgca gtatcccaat ggctccgcgc ccagcagccc atgctccccc agt 1433 

<210> 16 

<211> 654 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:342147.1:2000MAY19
<400> 16 

cgaattgggc ccctagatgt ttgctcgagc ggcggccgca gtgtgctgga aagggacaaa 60 
gacttgtaac tggagaaata gtttgtaagg gagatttttc ttcctctacc cacacctttc 120 
aaggcaggga gcaatgaaag acaaacctgt actgttcacc atatttcatt gattgcaata 180 
ggagtattga ggtcactttt atattgtcct ggatagtatg tagttacgcg gtttgtaaag 240 
agaggaatgg gatggggggc tgtgagaagg aagaattagt ggtcgatttc ggaggagcag 300 
gatggagatc cctgtgcctg tgcagccgtc ttggctgcgc cgcgcctcgg ccccgttgcc 360 
cggactttcg gcgcccggac gcctctttga ccagcgcttc ggcgaggggc tgctggaggc 420 
cgagctggct gcgctctgcc ccaccacgct cgccccctac tacctgcgcg cacccagcgt 480 
ggcgctgccc gtcgcccagg tgccgacgga ccccggcccc ttttcggtgc tgctagacgt 540 
gaagcacttc tcgccggagg acattgctgt caaggtggtg ggcgaacacg tggaggtgca 600 
cgcgcgccac gaggagcgcc cggatgagca cggattcgtc gcgcgcgagt tcca 654 

<210> 17 

<211> 1651 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: LG:1097300.1:2000MAY19
<400> 17

gccgccgagg aggaggccct gctggtttct gtgcgggctc ttgtccagga tggtgaagct 60
gttcatcgga aacctgcccc gggaggctac agagcaggag attcgctcac tcttcgagca 120 
gtatgggaag gtgctggaat gtgacatcat taagaattac ggctttgtgc acatagaaga 180 
caagacggca gctgaggatg ccatacgcaa cctgcaccac cacaagcctc atggggtgaa 240 
catcaacgcg gaagccagca agaacaagag caaagcccca accaagttac acgtgggcaa 300 
catcagcccc acctgcacca accaagagct tcgagccaag tttgaggagc acggtccggc 360 
catcgaatgt gacatcgcga aagactatgc cttcgcacac atggagcggg cagaggacgc 420 
agcggaggcc atcaggggcc tcgacaacac agagtttcaa ggtgaactgc tctgggcctg 480 
ggtagtagcg ccgagtgggg tctagctcaa aacaggcaag aacacaagac tatagaactt 540 
gctgggtggt ctcttccatt ctgttttagc tggaaataat agattatgtt taccgctctt 600 
aagcataatt tacccctggg gaagcaaaca cttcctcttt tcaggtttgc taagatgttg 660
ctcaccgact gcatagaatc acaaactgtg ggttacttta ccctgcggga ttcttgcatt 720 
gattcgagtg ctgttggaag tgtaatctgc ttggggaaac gagtacctca tgagagaagg 780 
gaggataaag gtccgtggct tacctgcttc tttggtgatg atcaggaagc cttatatttg 840 
agggtttaag tgcttaagat ttatattctt tactgctttg ggtggatact ggtgggaaag 900 
aagaaaaaag acatctagag gaagccctat attataaatc tgggtggcaa gtctggatct 960 
gcgggagtat ctttttgttg atcaaagttg tgcagtctct tcaagcagag tcaaaaaaac 1020 
atgccatgga gtgttctgct ccacctgttc atttcaccct cagaaaagga aatttctaaa 1080 
tatatcagac tcaatgggaa tgatggtccc gcttctgaag aaatttcagt acaagcatcg 1140 
tagagcatat catactattt ataccgataa taaaggtaca tatgttgtca ttaataccac 1200 
aagaggttgt cagaagactc tagaactgtg ctaatatggt aaccacatgc ggcttagtaa 1260 
attgaaatta acagattaga taaaatttaa aattcagttt ttcaagtgta taccagacac 1320 
gtttcaagca ctcagtagtc atgaggcctg tggctaccgt attaatagag acacagaaca 1380 
tttccatcat catagaacat tcttttggat agcactgttc tacaagtgtt ttgttaacag 1440 
tatcgtcttg gacctcatgt tcatagccac ttttgtggtt cctaagtcaa cacctttttt 1500 
gccctgagtg tcattaaagg ggttgttaag aagtactttt gggtcttcta ttaaaactaa 1560 
aaaacaaaat gagaaaaata atgggagaag aggaaaagtt gaccagagaa gggtaagaaa 1620 
gtttgcatag tggagatggg tagaggagca c 1651

<210> 18 

<211> 1870 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:444850.9:2000MAY19
<220> 

<221> unsure 

<222> 1865, 1867 

<223> a, t, c, g, or other 

<400> 18 

ggctctgaag ccattacaaa ggttgcttaa cttctaatta tttgatcact gaggaaaatc 60 

cagaaagcta cacaacactg aaggggtgaa ataaaagtcc agcgatccag cgaaagaaaa 120 

gagaagtgac agaaacaact ttacctggac tgaagataaa agcacagaca agagaacaat 180 

gccctggaca tggctccaga gatccacatg acaggcccaa tgtgcctcat tgagaacact 240 

aatggggaac tggtggcgaa tccagaagct ctgaaaatcc tgtctgccat tacacagcct 300 

gtggtggtgg tggcaattgt gggcctctac cgcacaggaa aatcctacct gatgaacaag 360

ctagctggga agaataaggg cttctctctg ggctccacag tgaaatctca caccaaagga 420 

atctggatgt ggtgtgtgcc tcaccccaaa aagccagaac acaccttagt cctgcttgac 480 

actgagggcc tgggagatgt aaagaagggt gacaaccaga atgactcctg gatcttcacc 540 

ctggccgtcc tcctgagcag cactctcgtg tacaatagca tgggaaccat caaccagcag 600 

gctatggacc aactgtacta tgtgacagag ctgacacatc gaatccgatc aaaatcctca 660 

cctgatgaga atgagaatga ggattcagct gactttgtga gcttcttccc agattttgtg 720 

tggacactga gagatttctc cctggacttg gaagcagatg gacaacccct cacaccagat 780 

gagtacctgg agtattccct gaagctaacg caaggtacca gtcaaaaaga taaaaatttt 840 

aatctgcccc aactctgtat ctggaagttc ttcccaaaga aaaaatgttt tgtcttcgat 900

ctgcccattc accgcaggaa gcttgcccag cttgagaaac tacaagatga agagctggac 960 

cctgaatttg tgcaacaagt agcagacttc tgttcctaca tctttagcaa ttccaaaact 1020 

aaaactcttt caggaggcat caaggtcaat gggcctcgtc tagagagcct agtgctgacc 1080 

tatatcaatg ctatcagcag aggggatctg ccctgcatgg agaacgcagt cctggccttg 1140 

gcccagatag agaactcagc cgcagtgcaa aaggctattg cccactatga ccagcagatg 1200 

ggccagaagg tgcagctgcc cgcagaaacc ctccaggagc tgctggacct gcacagggtt 1260 

agtgagaggg aggccactga agtctatatg aagaactctt tcaaggatgt ggaccatctg 1320 

tttcaaaaga aattagcggc ccagctagac aaaaagcggg atgacttttg taaacagaat 1380 
caagaagcat catcagatcg ttgctcagct ttacttcagg tcattttcag tcctctagaa 1440
gaagaagtga aggcgggaat ttattcgaaa ccagggggct attgtctctt tattcagaag 1500 
ctacaagacc tggagaaaaa gtactatgag gaaccaagga aggggataca ggctgaagag 1560 
attctgcaga catacttgaa atccaaggag tctgtgaccg atgcaattct acagacagac 1620 
cagattctca cagaaaagga aaaggagatt gaagtggaat gtgtaaaagc tgaatctgca 1680 
caggcttcag caaaaatggt ggaggaaatg caaataaagt atcagcagat gatggaagag 1740 
aaagagaaga gttatcaaga acatgtgaaa caattgactg agaagatgga gagggagagg 1800 
gcccagttgc tggaagagca agagaagacc ctcactagta aacttcaggt atccaaatgc 1860 
aaaananaaa 1870 

<210> 19 
<211> 628 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: LG:402231.6:2000MAY19
<220> 

<221> unsure 

<222> 580, 592 

<223> a, t, c, g, or other

<400> 19

gcgctctctt ttccaggatc atccagcagc tcgtcaacgg catcatcacg cccgccacca 60 
tccccagcct gggcccctgg ggagtcctgc actcaaaccc tatggactac gcctgggggg 120 
ccaacggcct ggatgccatc atcacacagc tcctcaatca gtttgaaaac acaggccccc 180 
caccggcaga taaagagaaa atccaggccc tccccaccgt ccccgtcact gaggagcacg 240 
taggctccgg gctcgagtgc cctgtgtgca aggacgacta cgcgctgggt gagcgtgtgc 300 
ggcagttgcc ctgcaaccac ctgttccaca caacatacga gcaggcctgg ctggagcagc 360 
acgacagctg ccccgtctgc cgaaaaagcc tcacgggaca gaacacggcc acgaaccccc 420 
ctggcctcac tggggtgagc ttctcctcct cgtcgtcatc gtcctcctcc agctcgccca 480 
gcaacgagaa cgccacaagc aactcgtgag cccacgtcgg ccgtcgggaa agcacggggc 540 
ctttcccacc caccctcagc cagcgccaca cggcacccan agactgggtg cnccggcggc 600 
gccacgcttg gctggtcagc gctgcagg 628

<210> 20 
<211> 798 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:1076157.1:2000MAY19
<220> 

<221> unsure 
<222> 777 

<223> a, t, c, g, or other

<400> 20

aaaaaaaaat tgctttatgg aagaaagtaa gtatagacag agagaaaggg atctgatgac 60
caaagcaggg aataaatgtt tggagtccac ggcatcctga gaacttcttg ggaatagagt 120
ctaggccccc aatgctgtca ctctcaccca tcctcctcta cacatgtgag atgtttcagg 180
acccagtggc ttttaaggat gtggctgtga acttcaccca ggaggagtgg gctttgctgg 240
atatttcgca gaggaaactc tacagggaag tgatgctgga aactttcagg aacctgacct 300
ctatagggaa aaagtggaaa gaccagaaca ttgaatatga gtaccaaaac cccaggagaa 360
acttcaggag tctcatagaa gggaatgtca atgaaattaa agaagacagt cattgtggag 420
aaacttttac ccaggttcca gatgacaggc tgaacttcca ggagaagaaa gcttctcctg 480
aagcaaaatc atgtgataac tttgtatgtg gagaagttgg cataggtaac tcatctttta 540
atatgaacat cagaggtgac attgggcaca aggcatacga gtatcaggac tatgcaccaa 600
agccatataa gtgtcaacaa cctaagaaag ccttcagata tcacccctcc tttagaacac 660
aagaaaggaa tcacaccgga gagaaaccct atgcttgtaa agaatgtgga aaaaccttta 720
tttcccattc aggcattcga agacgcatgg taatgcacag tggggatgga cccttanatg 780
taagttttgt gggaaagc 798

<210> 21

<211> 410 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: LG:1083142.1:2000MAY19



<220> 

<221> unsure 
<222> 51 

<223> a, t, c, g, or other 



<400> 21 

ttccgttttc gcgtggttct tttgcaagct ctggattctc tggagtttga ntgtttccag 60 
tattggaacc ccaccaagta ggactgatca ggtcttacaa ttctaaaacc atgacctgtt 120 
ttcaggaatt agtgacattc agggatgtgg ccatagactt ctctcggcag gagtgggaat 180 
acctggaccc taatcagagg gacttataca gggatgtgat gttggagaac tatagaaacc 240 
tggtatcact gggaggacat tccatttcta aaccagttgt ggttgattta ctggagcgag 300
gaaaagagcc ctggatgatt ttgagggaag aaacacagtt cacagatttg gatttacagt 360 
gtgagataat cagctacata gaagtaccca cttatgaaac agatatatcc 410 

<210> 22 

<211> 819 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:1083264.1:2000MAY19
<400> 22 

cggaagccga ttgcagggag aaactgtttt cgcagcagtg cgcctccctt ttccagccac 60 
cggttctcct gaccccgagt gtggggggtg acttcagtct cctgacatcc agtgttctct 120 
cgagccagtt tccagcccac agaaaatgag ctcttccgga agtgggcatc ttattccaat 180 
cccctccctg tgaatgtgtg gagaaaaaga gatgggaacg aggcagagga aatagagaaa 240 
ttttgaaaga gaaatgaaga atgagagacc cattaacaga aggcaaagta gaaggttcac 300 
aaattttaag aaagggagaa taaagtgaaa aaaatctcag aaggaatcca ctcaacagac 360 
gaggattcac ttccaaagag acatattatg caaggaagca acttggaaga ggaaagaaaa 420 
gaagtcagga atggccctta ctcagggacc cttgaaattc atggatgtgg ccatagagtt 480 
ctctcaggaa gagtggaaat gcctggaccc tgcgcagagg actttataca gggacgtgat 540 
gttggagaat tataggaacc tggtctccct gggaatctgt cttcctgacc tgagtgttac 600 
ctccatgtta gagcaaaaga gagatccctg gactctgcag agtgaagaga aaatagcaaa 660 
cgatccagac ggcagggagt gcatacaaaa ggtgtgaaca cagagaggag ctctaaattg 720 
ggaagtaatg caggaaacaa gaccttgtaa aaatcaaatt ggattcaact tttacagtat 780 
aaattatgag tgatatacag ctaatttcaa gactgaaag 819 

<210> 23 

<211> 2516 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:350793.2:2000MAY19
<220> 

<221> unsure 

<222> 85, 118, 146 

<223> a, t, c, g, or other 

<400> 23 

agtgtttctc atatctgggg tcactttaga caactgtgtt gaagttggac ggattgccaa 60 

cacctacaat ctaaccgaag tgganaaata cgttaacagt ttcgtcttga agaatttncc 120 

tgcattgctg agcacagggg agttcntgga aactcccttt tgagcgtctt gccttcgtgc 180 

tttccagtaa tagccttaag cactgtactg aacttgagct ctttaaggct acctgtcgtt 240 
ggcttcgcct ggaagagcct cggatggact ttgctgcaaa attaatgaag aacatacgat 300
ttccactgat gacaccacag gagctcatta attacgtgca aacggtggat ttcatgagaa 360
ctgacaatac ttgtgtgaat ttgcttttgg aagccagcaa ttaccaaatg atgccatata 420 
tgcagccagt tatgcagtca gacaggactg ccattaggtc tgacaccact cacttggtta 480 
cactaggagg agtgctgagg cagcagctgg ttgtcagtaa ggaattgcgc atgtatgatg 540 
aaaaggccca tgagtggaaa tcgttagccc ccatggatgc cccaaggtac cagcatggca 600 
tcgccgtcat tggaaatttt ctctatgtgg ttggcggaca gagtaattat gatacaaaag 660 
gaaaaacggc agttgataca gtcttcagat ttgatcctcg atacaataaa tggatgcaag 720 
ttgcatcttt aaatgaaaag cgcaccttct tccacctaag tgccctcaaa ggatatctgt 780 
atgcagttgg tgggcgaaat gcagcaggtg aactgcccac agtagaatgt tacaatccaa 840 
gaacaaatga atggacctat gttgccaaaa tgagtgagcc ccactatggc catgctggaa 900 
ctgtgtatgg aggagtgatg tatatttcag gaggaattac tcatgatact ttccaaaagg 960 
agctcatgtg ctttgaccct gatactgaca aatggatcca gaaggcgcca atgaccactg 1020 
tcagaggtct gcattgcatg tgtacagtgg gagaaaggct ctatgtcatt ggtggcaatc 1080 
acttcagagg aacaagtgat tatgatgatg tcctaagctg tgaatactat tcacctatcc 1140 
ttgaccagtg gaccccaatt gctgccatgt taagagggca gagtgatgtt ggggtcgctg 1200 
tcttcgaaaa taaaatctat gtggttgggg ggtattcttg gaataatcgt tgtatggtag 1260 
agatagtgca gaaatatgat ccagataaag atgaatggca taaggttttt gatctgccag 1320 
aatcccttgg tggcattcgt gcttgcacac tcacagtttt tccaccagaa gaaaccacac 1380 
catcaccttc tagagagtcc cctctttctg caccttaaga tcatctctac aactaagatg 1440 
ctgtagttct atctttgcaa tgtgtcataa attctcttct ttttccccct taagtagtat 1500 
atatgttagg attaccctct ggtaattgat acagatattg gaaaaaagac aacattgatg 1560 
ttatttgtgc tctttgtttg gcctagaatg tttataaagt ggtaacacaa ccattctgga 1620 
aatgtatccc atagaagctg atgtttaaca tatgaaaaaa aaagtattgt ctataaaatg 1680 
tttcttcagt actttttaaa tgctgtgtat tgggtgtaag gtatttgtca tcttacatta 1740 
gtaaacccaa taagccaagt tgaaggtgga ttatagtaaa tgtacaactg tgctcactag 1800 
gcttcaagta aaaagttttc ctttcatctt tgactgtaag atgtcaaagg gaggcagcct 1860 
gcttgaacag gaaacaatac acaaaaggtt gccaactcgc atgagctacc tccctctttt 1920 
cataaagtat ttttgacata tctgtcaacc cacttgactg tgtgggtgca ttgagaacac 1980 
aaagtttcct agacacacag gagaagtagc ttaaattcac taatattaat ttaaaaagca 2040 
gcatgaaccc tctacttata aacaagggtt tggtgttttt aaagtgtgta tacatacata 2100 
cacatacaca catgcacata tgtcaaatat aattttttta aaaattgagt ggcacatcaa 2160 
agaaatgtga aattaaaaag aattcttcca aaaagcagct tccattaaaa tgggaattca 2220 
gtatgcacat actgaatgca tatatgtaga accatacaga atttaggtgg ataagggcta 2280 
gaaattttga gcaacaaaat ttgtcacttg accagatttt atcttcaaaa actgtattct 2340 
actccttctc ctttgctgtt gaggtaactt gcatattata tgtattctgt atactcagtt 2400 
cataaggtta tttagcacaa agtatagcag cttcacctgg agagctgctt ttgctcagta 2460 
aattcaactt ccatgtttta tctttttttg ttccaataaa aacatttaat gtcaaa 2516

<210> 24 
<211> 1660 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:408751.3:2000MAY19
<400> 24 

tagggaccca ggatggcaga tccgggaccg ggctgggctg gcttggaaca tgcttgccaa 60 
ctcagccagc gtgaggatcc tcatcaaggg aggcaaggtg gtgaacgatg actgcaccca 120 
cgaggctgac gtctacatcg agaatggcat catccagcag gtgggccgcg agctcatgat 180 
ccctggcggg gccaaggtga ttgatgccac aggaaaactg gtgatccctg gtggcatcga 240 
caccagcacc cacttccacc agaccttcat gaatgccacg tgcgtggacg acttctacca 300
tgggaccaag gcagcactcg tcggaggcac caccatgatc atcggccacg tcctgcccga 360
caaggagacc tcccttgtgg acgcttatga gaagtgccga ggtctggccg accccaaggt 420 
ctgctgtgat tacgccctcc acgtggggat cacctggtgg gcacccaagg tgaaagcaga 480 
aatggagaca ctggtgaggg agaagggtgt caactcgttc cagatgttca tgacctacaa 540 
ggacctgtac atgcttcgag acagtgagct gtaccaagtg ttgcacgctt gcaaggacat 600 
tggggcaatc gcccgcgtcc atgctgaaaa tggggagctt gtggccgagg gtgctaagga 660 
ggcactggat ttggggatca caggcccaga aggaatcgag atcagccgtc cagaggagct 720 
ggaagctgaa gccactcatc gtgttatcac cagggatggg ggaaaccatg acgccgcctc 780 
ctggtgcagt gcacaccatc tctatccctg tcagccctca ctgggtcatg ggccttgggc 840 
agatgtcaaa gagcccagca gcagcggtgg tggccagctg ggcagagcat ccttgcttgg 900 
gctaggaaag ctttaccttc tctgagtgcc tccgcctgag agatgtgtga cccgtggcac 960 
cagggaacca cgtcttggag tggtccactg taggccatgc gcttcatcca cccccagtcc 1020 
ctacataggc cctacccttg cccgggagct tctagataga aatcagaaag agattcaagg 1080 
agccaaatga gcggtcagcc cccaccatgc actccttgcc ccgtgcagag ctccagccag 1140
cttcgtcacc agccccactg gctcctggtt ggaacgaaag ggtctctggt tgcactgaat 1200
gcagctctca aactggtctt gtacttgctg aataaatact gttgttcttg ccttagctgc 1260
tctctaggtt tgtggggtta agttgccaga aaattgtgct actgtgtgtg cgtgtgcgtg 1320
cgtgtgtgta gtgctaggag tccacagtag gtctctgtca agccgatgtc gtgatgaggg 1380
cttttctgat actgacccag aagccacaga accacaagga aacccaaacc ccctccagct 1440
gctgaggcgc aggcacagcc tggggtcgga tggagcctcc agcaccccag cacccaggtg 1500
acttccccac tcccctgtaa atgtcatggt gctaagactg tgtcaacccc aagacgacac 1560
atggtcctgt gctttggcca ccgtttgagg caaaaactaa acagcccgac acgttgtgtt 1620
ctggtgcagg tttgtattaa actgtagcta cttctcaaaa 1660

<210> 25
<211> 2762
<212> DNA

<213> Homo sapiens
<220>

<221> misc_feature
<223> Incyte ID No: LI:336120.1:2000MAY01

<400> 25

gaagaccatg agggtgcaca gctggaaaac tctggtgtct cagcttaggg cctcctccgg 60
gaagagctaa ctgctcccag gtgaagccgg tgcccgcggg cggtccgtac accccgcagc 120
cggctcgcac cgctcgagag cctcggccgc tgtgtcttcc acgtctgcag ctcagccagg 180
gcgcgcaggg cgagtggggt ccactggcgg gtaaagggga ccaggacggc gaggatggac 240
gcacagacct ggcccgtggg ctttcgctgc ctcctccttc tggccctggt tgggtccgcc 300
cgcagcgagg gcgtgcagac ctgcgaagaa gttcggaaac ttttccagtg gcggctgctg 360
ggagctgtca gggggctgcc ggattcgccg cgggcaggac ctgatcttca ggtttgcata 420
tccaaaaagc ctacatgttg caccaggaag atggaggaga gatatcagat tgcggctcgc 480
caggatatgc agcagtttct tcaaacggtc cagctctaca ttaaagtttc taatatctcg 540
aaatgcggct gcttttcaag aaacccttga aactctcatc aaacaagcag aaaattacac 600
cagtatactt ttttgcagta cctacaggaa catggccttg gaggctgctg cttcggttca 660
ggagttcttc actgatgtgg ggctgtattt atttggtgcg gatgttaatc ctgaagaatt 720
tgtaaacaga ttttttgaca gtctttttcc tctggtctac aaccacctca ttaaccctgg 780
gtgtgactga cagttccctg ggaatactca gaatgcatcc ggatggctcg ccgggatgtg 840
agtccatttt tgtaaattat tccccaaagg agtaatgggg acagatgggg gaggtccctg 900
ctgcccagcc gcacttttct gcaggcactc aatctgggca ttgaagtcat caacaccaca 960
gactatctgc acttctccaa agagtgcagc agagccctcc tgaagatgca atactgcccg 1020
cactgccaag gcctggcgct cactaagcct tgtatgggat actgcctcaa tgtcatgcga 1080
ggctgcctgg cgcacatggc ggagcttaat ccacactggc atgcatatat ccggtcgttg 1140
gaagaactct cggatgcaat gcatggaaca tacgacattg gacacgtgct gctgaacttt 1200
cacttgcttg ttaatgatgc tgtgttacag gctcacctca atggacaaaa attattggaa 1260
caggtaaata ggatttgtgg ccgccctgta agaacaccca cacaaagccc ccgttgttct 1320
tttgatcaga gcaaagagaa gcatggaatg aagaccacca caaggaacag tgaagagacg 1380
cttgccaaca gaagaaaaga atttatcaac agcctttcga ctgtacaggt cattctatgg 1440
aggtctagct gatcagcttt gtgctaatga attagctgct gcagatggac ttccctgctg 1500
gaatggagaa gatatagtaa aaagttatac tcaagcgtgt ggttggaaat gggatcaaag 1560
cccagtctgg aaatcctgaa gtcaaagtca aaggaattga tcctgtgata aatcagatta 1620
ttgataaact gaagcatgtt gttcagttgt tacagggtag atcacccaaa gctgacaagt 1680
gggaacttct tcagctgggc agtggtggag gcatggttga acaagtcagt ggggactgtg 1740
atgatgaaga tggttgcggg ggatcaggaa gtggagaagt caagaggaca ctgaagatca 1800
cagactggat gccagatgat atgaacttca gtgatgtaaa gcaaatccat caaacagaca 1860
ctggcagtac tttagacaca acaggagcag gatgtgcagt ggcgactgaa tctatgacat 1920
tcactctgat aagtgtggtg atgttacttc ccgggatttg gtaactgaac tcttctgtcc 1980
tgacatacct tactgaagtc tcgatttctt ctctctctgc atatgcctgg aataagagat 2040
cctttttcaa tgtaacaatt atatttatga aaagatatgt tacactaact tctcagaagc 2100
caagctgaaa tattcataaa gtccctaaaa ctcaacgttt aaatgacaca ctttaaaaat 2160
atgtcttttt tcaatctaac tgaaaacctt cttaacttct aatatattaa atctgaagat 2220
gtgaagggca cagaagtgac tttgaataag aagaatttag tgtatctgta attttattat 2280
caattcccaa gccccttcct ttctaaatta aaaatgtttt catttgaaag tgtatttgcc 2340
agacaatgaa aacagtatgc agtatttctt aaagtattga aattagaata tcatgaaata 2400
aatcaaaaca tacaatggca agtagtatgc atgcatattc aagagactct tccatttttg 2460
caagctgtag aaggaaatgt ctgaatgtct ataagttatg gggtagattc ttgagaagca 2520
tttccatata atttcactga agaaccttga taattttgac ccactgtaac ttagccactg 2580
atgaacctta aagctgagta ttttattaac acctgatttg tattccatta tattcaaaat 2640
gcatctttgg tattgtgcct ctgctcccat ctctctcttt gcctcataga tttagctatg 2700
ttgggaagca catgcttgct ctaggaatat ctccaataaa gctgttaact atttggtgga 2760
aa 2762



<210> 26 
<211> 4328 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:234104.2:2000MAY01



<400> 26 

tgcgcccgga 

gcggccgtcg 

cggcaccgga 

ccaggggaca 

accagcaccg 

tctggaggtc 

gctggccgag 

tctgcacccg 

ctgagcagag 

acacgtttct 

cctgcaggga 

acacttcacc 

tctctgccag 

agaccagccg 

agcacaagca 

acagcgtcac 

tggggtgaag 

acctccgcgt 

aatgggacaa 

cttgatttcc 

ccctccacaa 

acaagtccat 

gatgacacct 

tgctaggcca 

ggcccttaaa 

gagcagcaga 

cagctcacac 

agcgaagtca 

gactggcccc 

gccttcactc 

tgtacctgct 

ttccctgcct 

ttctctttca 

ttcattccac 

cttggcccca 

ggccgcacct 

ccgatctgga 

atgtgaaatt 

ttccctcgcc 

tcatgattga

caatggccaa

acgtcgaaag

tgccaagaca

acggaaagtg

tcttcaccac

acacagtgac

acatggacca

tctctgcttt

tcacagataa 

attgtcaatg 

gggcagagac 

ggctgatagt 

gccaagttga 

gatatcaacg 

attaggcatg 



gccggggccg 
cccgtgcccc 
gagategtta 
ggggacaatg 
gggctcctgc 
cacttcgacc 
gtctttgctg 
ttgccccggg 
ccacgagaag 
cactgtggag 
tggggtccga 
cgcccacgag 
ccecgagacc 
gctttcctgg 
cagcccagtg 
ggagtcactg 
gggacccagg 
tttcctgtgc 
tgtgctgtcc 
atcctgcccc 
ggctcagcct 
tccctcctta 
cccggaatgc 
cattctegtt 
ettgetggge 
ggaaggttga 
cctcccaatg 
aagaatgagt 
tgccccattt 
tcagcagagc 
cccacttcgt 
cagcccaggt 
gtcccataag 
ccagtctccc 
gcgcccagcc 
ggctggagag 
ccggagcagc 
tgaccatctg 
ateegtegtg 
gtgaagecac 
gctgacagaa 
cccagtggcc 
catcagccga 
caatgttcat 
attagtggac 
ctggctcttt 
catagaggac 
tttattctca 
atgeceggag 
cttcactggt 
cctggtcttt 
teegggtagg 
tcaaatccaa 
tagggtatta 
aaattaacca 



agtcgctgcc 
tcccagaccg 
aggaggctga 
gccacacgcc 
aggtcaegga 
tcctggacct 
acteggaega 
ccggctacct 
cagcccctag 
aggccccagg 
acacgatggc 
gctccagccg 
ggccttgtct 
ggggacacac 
gccttacgtc 
ecaggaaegt 
gcctgtgaca 
tgecaagett 
agcaaagcac 
ttcttctact 
ctaagacctg 
gctcagaaca 
teteggggtt 
tctgctcaca 
aggtttggag 
caggccctgc 
ettatatget 
ttaatatcaa 
cccatgagca 
tttgggagat 
gagccacccg 
ctccttccct 
gggcagcett 
tcccccgtcc 
ctgctctccg 
gctgggcggg 
cgegagcaga 
attccagttt 
tagtgaattg 
cccgtccgca 
tccatgacta 
attcaccagc 
gateggacca 
cacggcaacg 
ctgaagtgga 
tttggaatga 
ccctcctgga 
atagagacag 
ggaattattc 
gggatagect 
tccacccatg 
ggaccttagg 
acagaactcg 
caegggggat 
acagagtcct 



gcagctgttg 
caccggccgc 
ggtgccgcag 
tgtggaggag 
gaggaggcag 
cactgagctc 
cgagaacctc 
gcgctcccct 
gcgaccccga 
aggactagac 
agate tgggc 
tcacctcctg 
tgctgggcac 
ggggcccccg 
cagctcgttc 
gctgaggaat 
gccactccag 
cagaagecag 
acatggagaa 
ecaeggagtg 
cacctgcttc 
ccaaatatca 
ggggttcacc 
tcccattgcc 
ccccatggga 
tccctctgct 
gaagctcaca 
agtgtaagct 
cacttctggg 
gcccccaggc 
gctgcccctc 
ggtttccagt 
ttgtccctgg 
ctgcccaaac 
cgctcggcca 
cggatgggtg 
atggagtctc 
ttttcttttc 
ttcagtcttg 
gecaggaaaa 
acgtcctgga 
caaagttgee 
aaaggaaaat 
tgagggagac 
gattcaacct 
tctggtggtt 
ctccttgtgt 
aaaccaccat 
ttctcttaat 
gtgtgtgaaa 
cagtgatctc 
aattcccaca 
gagggggagt 
gaeegtctgt 
ttctgggaga 



gggcgcccgg 
atggagcccc 
getgegctgg 
gaggtegggg 
cctctgagca 
accgacatgt 
aacaccgagt 
tcctggacga 
gcggcaggcc 
catctccacc 
cagtgetgae 
acacacaccc 
gggtcttege 
gtatgectet 
ctgggccccg 
ggagtggccc 
gaactcctgg 
atgcgggttt 
gcggccccaa 
cgctgtctca 
tcttggcccc 
ccagactgcc 
tctccttgtc 
eggctacaag 
ccccgtgggt 
ctgggggtgt 
gaatgggctt 
tactttccat 
gaaggacaac 
atgcccgtga 
cgcactgctg 
cacacaagag 
ccactcttat 
gcgcgcccct 
gagggageca 
gaaactcgcg 
ctaacagcct 
cttttctttt 
ctccgtttca 
gcacaaagaa 

gggcgactcc 

taagcaggee 
ccagaggtac 
ctatcgctac 
attgattttt 
gategcatae 
taccaacctc 
tggttatggc 
ccaatactgt 
atctctcaaa 
catgegggat 
ttgtggaggc 
tcatcccgtt 
ttctggtgtc 
tctccaaagc 



gccaggcgac 
eggagggege 
gcgtcccagc 
gcatcccagt 
gcgtctcctc 
eggaccagga 
ccccagcagg 
ggaacaaggg 
acagtcctgg 
tgccccagct 
cccagcagac 
tgggggcagc 
ctcacttgtg 

ggggagcccc 

agtcaggaag 
acggcggcct 
gggtgctcca 
ggtagtggct 
aattcccatc 
ctagtggtcc 
tgcgtgacag 
taagagactt 
ctgcacccac 
gcctgcccac 
ctctgtccag 
ctgggagccc 
cttgcctgac 
ccccaagcca 
aggctccctg 
gctccttctg 
gcaaacccag 
cccagcagct 
ctttccccac 
ccgcccctcc 
gtceggagae 
gaegegggag 
ctcggtgctg 
ttgeatttec 
agagaggaga 
gaaactgcaa 
atggatcagg 
agggatgacc 
gtgaggaaag 
ctgaccgata 
gtcatggttt 
ataeggggag 
aacgggttcg 
tacegggtea 
gttggggtcc 
cccaagaaaa 
gggaaactgt 
ttccatcaga 
gaaccagacg 
accgctgatc 
ccagctgccc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 
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aaagaggaac tggaaattgt ggtcatccta gaaggaatgg tggaagccac agggatgaca 3360 
tgccaagctc gaagctccta catcaccagt gagatcctgt ggggttaccg gttcacacct 3420 
gtcctgaccc tggaggacgg gttctacgaa gttgactaca acagcttcca tgagacctat 3480 
gagaccagca ccccatccct tagtgccaaa gagctggccg agttagccag cagggcagag 3540 
ctgcccctga gttggtctgt atccagcaaa ctcaaccaac atgcagaact ggagactgaa 3600 
gaggaagaaa agaacctcga agagcaaaca gaaagaaatg gtgatgtggc aaacctggag 3660 
aatgaatcca aagtttagtg ccctagctgg gcaaaccctt ctcttctccc cccaacacaa 3720 
tctttccttg tctctcattc tctttctttt tctgtctctc tggctttgtt ctttatttgt 3780 
ttatatttaa tttttacatg accagaaaac aaatcttcaa ggtgtaaaat atctacctgc 3840 
cctctctcag ttattcagat tgacaaggta gacatggatt tgatgaaagt gcaaagtgcc 3900
ctcatttgtg gcccaagcct ggtctcctcc caaaatacta cacatccaac tcctggagat 3960 
ttcagttact tacctgcatg tgttgtacaa taccagatca ctcaaaaagg tgtgtcaaag 4020 
attttacctg ggatatgaca agcaaggttt ctggtgccta tttattcatt cagtgagaca 4080 
cagagtggag ccctcagttt tatggatccc aattcatttc atctactaca gggtgaggtg 4140 
cttgccccca tgtgggtgtg gcagttacag ggcccaggtg agctgaagac aaaccactgt 4200 
acatatatat gccttatgta attattttct ttttgtaatt agtaataaaa cccagcatgt 4260 
acaaaagtac catagaacag aactgctaaa tactgtacat agatgtatca ttaatgtagg 4320 
tttagata 4328 

<210> 27 

<211> 569 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:450887.1:2000MAY01
<400> 27 

cgtcggttca cttctccagg aaagggttcg tactcatggc gccgccgcag ccaaagtcgg 60 
gcctcttcgt tggcatcaac aagggtcatg tcgtcaccaa gcgcgagctg cctccccgcc 120 
cgtgccaccg caaggggaaa tcaacgaaga gggtgtctat ggtcaggggc ctgatcagag 180 
aggttgctgg gtttgctcct tatgagaagc gtatcactga gcttctgaag gttggcaagg 240 
acaagcgtgc cctgaagctt gctaagagaa agcttggaac tcacaagagg gcaaagaaga 300 
agagagagga gatggcgggc gtcctcagga agatgaggtc ggctggtacg cacactgaca 360
aaaagaaata gagagcattt caagttcatg gagctggctg ccagagatta tgttccagtg 420 
tctgattttc catacatgta gaacctaata gacatgtcaa agtattatgt atcgaaccag 480 
ctcatgggat tttgctcctt ccaatgcatc cagggtttat gtatcgaacc aatttatggg 540 
atcttgctct tattctaatg catccatgg 569 

<210> 28 

<211> 3644 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:119992.3:2000MAY01
<220> 

<221> unsure 
<222> 2628 

<223> a, t, c, g, or other
<400> 28 

gacaatcttc aggacacact tgaagctgct agctttttta caaatattac ccgttttgga 60 
tttctgtaaa gtatttctta tatcaggagt ctctttggat aactgtgttg aggttggacg 120 
aattgctaac acctacaatc ttatagaagt ggataaatat gttaataatt tcatctctga 180 
agaactttcc tgctttattg agtactgggg agtttctaaa actccctttt gaacgacttg 240 
catttgtgct ttccagtaat agtcttaagc actgtaccga acttgaactc tttaaggcag 300 
cctgtcgctg gctaaggttg gaagaccctc ggatggatta tgctgcaaag ttaatgaaga 360
atattcgatt tccactgatg acaccacagg atctcatcaa ttacgtgcag acagtagatt 420 
tcatgagaac agacaatacc tgcgtgaatt tgcttttgga agctagcaat taccaaatga 480 
tgccatatat gtcagccagt gatgcagtca gatagaactg gcaatcgaac tggattccac 540 
tcacttggtt acattaggag gagttttgag gcagcagctg gttgtcagta aagaattacg 600 
gatgtatgat gaaagggcac aagaatggag atctttagcc ccaatggatg ctccccgtta 660 
ccagcatggt tattggctgt tcattggaaa ctttctttat gtagttggtg gtcagagtaa 720
ttatgataca aaaggaaaaa ctgctgttga tacagttttc agatttgatc ctcggtataa 780

taaatggatg caggttgcat cattaaatga aaagcgcaca ttctttcact tgagtgccct 840

caaaggacat ttgtatgcag ttggtgggcg cagtgcagct ggtgaactgg gcacagtaga 900 

atgttacaac ccaagaatga atgagtggag ctatgttgca aaaatgagtg aaccccacta 960 

tggtcatgct ggaacagtat atggaggctt aatgtatatt tcaggaggaa ttacccatga 1020 

cactttccaa aatgagctca tgtgttttga cccagataca gataaatgga tgcaaaaggc 1080 

tccaatgact acagtcagag gtctgcattg catgtgtaca cgttggagat aagctctatg 1140 

tcattggtgg caatcacttc aagaggaaca agtgattatg atgatgttct aagctgtgaa 1200 

tactattcac caacccttga ccagtggacc ccaattgccg ccatgttaag aggccaaaga 1260 

tgatgttgga gttgccgtct tttggaaaat aaatttaatg ttgttgttga atattctggg 1320 

aataatcgtt gtatggtaga aattgtccag aaatatgacc cagaaaaaga tgagtggcat 1380 

aaagtttttg atcttccaga gtcacttggt gggcattcga gcctgtacac tcacagtttt 1440 

tccacctgaa gaaaaccctg ggtcaccttc tagagaatca cctctttcag caccttcaga 1500 

tcattcttag gtctaaggtg taacaccttt gcagtacgtc gatgggtgat ctaatacttc 1560 

cccttcagtt gtatcttctt acagtgattg gtacagttat tagatataaa ggtaactgat 1620 

gttattcgtc ttgtatggct tttagtatgt gctatcaagt ggctaacaaa tgcattctga 1680 

aaatgtattt aacatagctg tgctaacaaa tgaaaaaaag acgtagaaaa atgtttagat 1740 

gtctttttgt gatgttatat aaaattgtag atgactgtgg taaatgtgta attatgtcca 1800 

ttatgcttca aagttgaagt tttcatcttt gactccaaaa tgtcagaggg aggccgctct 1860 

aaactaaaaa taacgaaagg ttgccaagta ttaatactag ttacctccct cttttcgtag 1920 

tttttgtcat gtctgtcaac ttactcgatt gtgtggttgc attcagaata tttgaagttt 1980 

cttacgtaga cagaaataat aaaaatatta actaggaaaa aacagtatag caccaagcca 2040 

gtatttggta tctctctcta gagcgagcaa gagagggaga gaggaggaaa aaatacacat 2100 

aatacaaaca tacatgcatg cacacataca tacatatgta tacacacaca taatttgaaa 2160 

actgattggc cacttcaacg atggctgaaa ttgtttttaa attgaagttt ctttcttcca 2220 

caaagcagcc cgtttctatt caaatggaaa ttcagtacca gagaataaat gtctatgtag 2280 

tcatactgaa tttagataga taagggctac aagcatacta aatcgagcaa ccaaatttgt 2340 

catgtgacta aacccgttac ttcagatgaa gcttacatta ctgttttctg cttgtgtatt 2400 

ttcccgtaga gtacttttac acagattggt aaaggttcag gtttcacgag aactgctttt 2460 

gtgcagaaaa tttaggttct tttttccacc ttttttgggt cagtaaaact taatgaaaaa 2520 

agcaaagaaa aaaaatattc tggaacaaag ctataagggt tttaaagttc agcctcccaa 2580 

cgttaagtca tcctaacatg attattttgt gatttggggg tgcttgcncc tggtgctgtt 2640 

ccagtccatg tggcatcctg agctgtgtga tctgcctcga ggctatgatc tgagcacgca 2700 

ggagataaca ttttcttctg catcaagtga ggaaaaatgt gcttttgggc catgtctcaa 2760 

agacaggacc aacttcagat ttcccaaaga agccagctac agagcctctg gaacactatg 2820 

gtcttacaag cagtacttaa aatcaaccct cgagcctctt caatgccgaa aggtatcccc 2880 

tatttggttg agaaccacat ggtaattttt aatgggactt tttatcagca aatggagtta 2940 

caggaattct ctgtaatgag tgattctgaa gaggtacttt cctgggaata attatctacc 3000

tgaagaaaaa aaattttata tatacattgt gtgtgtgtgt aatacacaca cacacaagcc 3060 

ccctaatacc tggaagatctg tcagcatgta aatcaggaac aactttctcc cttattgaca 3120

atccccatta attaaaactc aggaaccaag gcaaaatgaa ttggcttcta gggggtctga 3180 

accttactgc cccatacaag tgttgattca ttttaatgct gtttatgatt tctgcattgg 3240 

cagaaaattt tcatactttc tatgtttttt ttaattactc agttttttat tacctaaaaa 3300 

taggcacatt tgagtacatt tgaaaagtag aaaaattaga aattattaac tttattgaat 3360 

aagcaagaag tgcatcctaa tccctttgat tattaatgag gttgaatatt tgtgtgctat 3420 

cggtagctgt gtttctttga tcaaatgttc ctgtcctttt gcccttctgt tatctgttgg 3480 

gagttgcttt gtttttcgta tcaagttata gggatctctt tatataataa atgtaattta 3540 

acttgcattt gcttgcattt atttcttccc tcaatctgtt gtagttttac aaaggcaacg 3600

ctgttcagtt aatttttgag atcaaatttg tctttttttt tttt 3644

<210> 29 

<211> 2805 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:197241.2:2000MAY01
<220> 

<221> unsure 
<222> 325 

<223> a, t, c, g, or other 
<400> 29 

ccccgttccc gattcctgta gtagcggctg tattgcagcc gcctgccgaa ctgacccggg 60

tctggggact ggcccctctg gcgccgttcg gtttctctta ttgccttcac tgaggatgag 120
tccctttgtg gctctatgtg gaccctgcgg aatccaccgg cgcagtttca tctagcgact 180
ggtcaccctt ggcaatttat ggatatttaa acagggtcag acagtgtgga cgggggagtt 240
ccccctcctc actccccctt ggtgcttgac tccaggaata atttataaac tgtggaattt 300
ttttaaactg aagaacttgt atttncgata tgaactttat agaagctatt tataactttt 360
tttggattta agctggccaa aaaattgcta taacagatat atacgtttta tactattgtc 420 
aggcaggatt taacattatc ctaaaaaggt aatttattct ctgtaacttc ctcaatagca 480 
cctttgtgtc ctggcttttt cattttttaa aattagtttt cacgattctg aagtaagtgg 540 
tataaaaaca gttagggatg agttcaccca tgcctgactg cacatcaaag tgtcgatccc 600 
tgaagcatgc tttggaagtc ccttctgtgg taacaaaggg gagcgaaaac ccgattaagg 660 
cccttctctc cacgtcattg ttacaaagct gccactatca aggatgtttt tggcaggaat 720 
gccctccacc cctgtttcct cctcgtggag aagaaaggag tgttagattg gcttattcag 780 
aaaggagtgg atctgttggt gaaagaccaa gagtctggat ggacagcctt gcaccagaag 840 
cactttttta tggacatatt gattgtgttt ggtctctatt gaagcatggt gttagtctgt 900 
atattcaaga taaagaaggc ttgtcagctt tggatcttgt aatgaaggat agaccaactc 960 
atgtagtatt caagaatact gatcctacag atgtttatac ttggggcgat aatacaaatt 1020 
ttaccctggg tcatggcaag ccagaatagc aaacatcatc cagagttggt ggatctgttc 1080 
tccaggagtg ggatttatat caagcaggtg gtgctttgta aatttcactc cgtgtttctg 1140 
tctcagaaag ggcaggttta tacctgtggt catggtcctg ggagggcgat tagggacatg 1200 
ggagatgaac agacatgctt ggtccctcgg cttgtggaag gactgaatgg tcataattgt 1260 
tcccaagtgg cagctgctaa ggatcatact gttgtattaa ctgaagatgg atgtgtttat 1320 
acatttggtc taaacatttt tcatcaatta ggaattattc caccgccttc cagttgtaat 1380 
gtacccagac agatacaggc aaaatatctg aaaggaagga caatcattgg cgttgcagca 1440 
ggcaggtttc atacagtcct atggactaga gaagctgttt acactatggg actacatggt 1500 
ggacaactcg gttgtttgct agatcccaat ggagaaaagt gtgtaactgc tcctcgtcag 1560 
gtctctgccc ttcaccataa agacattgct ctgtctttgg ttgctgcaag tgatggagct 1620
acagtctgtg ttaccacaag gggagatatt tacttacttg cagactatca gtgcaagaag 1680
atggcttcta aacagttgaa cttgaaaaaa gttcttgtgt ctgggggtca tatggaatac 1740 
aaggttgatc ctgaacattt gaaagaaaat gggggtcaaa aaatttgcat tcttgcaatg 1800 
gatggagctg gaagggtgtt ttgctggaga tcagtcaaca gttctctgaa gcagtgtcga 1860 
ttgggcctat ccacgtcagg gtcttcattt ctgatatggc tttaaataga aatgaaattc 1920 
tatttgttaa cgcaaggatg gagaaggatt tagagggaga tggtttgaag agaaaagaaa 1980 
gagttctgga aaagaaagag attttatcaa accttcacga ttcctcatca gatgtgtctt 2040 
atgtctctga tataaatagt gtgtatgaaa gaattcgact tgagaaactt acctttgcac 2100 
atagagcctg ttagtgtcag cacagatcca agtggatgca actttgcaat cctgcagtca 2160 
gatcctaaaa caagccttta tgaaaattcc agctgtgtcc tcatcatcct tttttgaaga 2220
gtttggcaaa ctgttgaggg aagcagatga aatggacagc attcatgatg tgacatttca 2280 
agttggcaat agactcttcc ctgcacataa atatattttg gcagtgcatt ctgatttttt 2340 
ttcagaaatt gtttcttttc agatggtaat acttcagaat ttacagatat ttaccagaaa 2400 
gatgaagatt ctgccagggt gccatctctt tgtggtagag aaggttcatc cctgacatgt 2460 
ttgaatacct tttacaattt atatacacag atacttgtga ctttttaact ccatggcttc 2520 
aaacccaaga atacacttaa acaaaaaccc agaagaacta tcagggaact ctgaattctc 2580 
atttgaataa agtgaatttc catgaagatg ataaccagaa gtctgcattt gaagtttaca 2640 
aaagtaatca agctcaaaca gttagtgaga ggcagaagag caaacctaaa tcttgtgaaa 2700 
caaggcaaaa atattaggga agatgatcct gtaagaatgt tgcaaactgt gtggaagaaa 2760 
ttcgacttca gtaatttgag tagtaggtta gatggagtca gattt 2805 

<210> 30 

<211> 572 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:406860.20:2000MAY01
<400> 30 

gtttgtatgt gatgctggag atgactcggc cttcttcact gtcactgtca cagctggcac 60 
tgttctcaag agctgtgctg ccagtgggga gggctgagga tctggcgggt gaggcaggag 120 
aggcctgctg gccaagccta tgtgcccctc tccatgccca cccaccagcc ccaccagaga 180 
ggattgtgca cccggcagcc cgctccctgg atctgcattt tggggctcca gggcgcgtgg 240 
agctgcgctg tgaggtggcc ccagctgggt ctcaggtgcg ctggtacaag gacgggctgg 300 
aagtggaggc atcagatgcc ctgcagctgg gtgccgaggg gcccacccgc accctgaccc 360 
tgccccacgc ccagcctgag gacgccgggg agtatgtgtg tgagacccgg catgaggcca 420 
tcaccttcaa tgtcatcctg gctgagcctc cagtgcagtt ccttgctcta gagacaactc 480 
caagcccgct ctgtgttggc cccggggagc cagtggtgca ggagggcgag ggcctagagc 540 
tccatgccga gggccccgcc gagtctctgc at 572

<210> 31
<211> 1082 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: LI:142384.1:2000MAY01



<400> 31 

ggcggacgtg ctgccgagta gtcccggaag cgaagcagcg atggcggaga gtccgactga 60 
ggaggcggca acggcgggcg ccggggcggc gggccccggg gcgagcagcg ttgctggtgt 120 
tgttggcgtt agcggcagcg gcggcgggtt cgggccgcct ttcctgccgg atgtgtgggc 180 
ggcggcggcg gagtgtgggc ggggccgggg gcccggggag cggcctggct ccgctgcccg 240 
ggctcccgcc ctcagccgct gcccacgggg ccgcgctgct tagccactgg gaccccacgc 300 
tcagctccga ctgggacggc gagcgcaccg cgccgcagtg tctactccgg atcaagcggg 360 
atatcatgtc catttataag gagcctcctc caggaatgtt cgttgtacct gatactgttg 420 
acatgactaa gattcatgca ttgatcacag gcccatttga cactccttat gaagggggtt 480 
tcttcctgtt cgtgtttcgg tgtccgcccg actatcccat ccacccacct cgggtcaaac 540 
tgatgacaac gggcaataac acagtgaggt ttaaccccaa cttctaccgc aatgggaaag 600 
tctgcttgag tattctaggt acatggactg gacctgcctg gagcccagcc cagagcatct 660 
cctcagtgct catctctatc cagtccctga tgactgagaa cccctatcac aatgagcccg 720 
gctttgaaca ggagagacat ccaggagaca gcaaaaacta taatgaatgt atccggcacg 780 
agaccatcag agttgcagtc tgtgacatga tggaaggaaa gtgtccctgt cctgaacccc 840 
tacgaggggt gatggagaag tcctttctgg agtattacga cttctattag ggtggctgca 900 
aagatcgcct gcaccttcaa ggccaaacta tgcaggaccc ttttggagag aagcggggcc 960 
actttgacta ccagtccctc ttgatgcgcc tgggactgat acgtcagaaa gtgctggaga 1020 
ggctccataa tgagaatgca gaaatggact ctgatagcag ttcatctggg acagagacag 1080 
ac 1082 

<210> 32 
<211> 2497 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:895427.1:2000MAY01
<220> 

<221> unsure 
<222> 1938 

<223> a, t, c, g, or other 
<400> 32 

tagcctgcac ctgtacggtc tcggggggct gcggccagcg ccgggggcca cccccaggga 60 
cctctgctgc ctactgcaag tggatgggga ggccagggcc cgaacagggc cactgccacg 120 
ggggccggac ttcctgctgg ctggaccaca ccttccacct ggagctggag gccgccaggc 180 
tcctgcgcgc cctggtgctt gcgtgggacc ctggcgtgag aaggcaccgg ccctgtgccc 240 
agggcaccgt gctgctgccc acggtcttcc gagggtgcca ggcccaacag ctggccgtgc 300 
gcctggagcc tcaggggctg ctgtatgcca agctgaccct gtcggagcag caggaagccc 360
ctgccacagc tgagccccgc gtctttgggc ttgcccctgc cactgctggt ggagcgggag 420 
cggccccccg gccaggtgcc cctacatcat ccagaagtgc gttgggcaga tcgagcgccg 480 
agggctgcgg gtagtgggac tgtaccgtct ttgtggctca gcggcagtga agaaagagct 540 
tcgggatgcc tttgagcggg acagtgcagc ggtctgccta tctgaggacc tgtaccccga 600 
tatcaatgtc atcactggca tcctcaagga ttatcttcga gagttgccca ccccactcat 660 
cacccagccc ctgtataagg tggtactgga ggccatggca ccgggcaccc cccaaacaga 720 
gttcccccca ccactgaggg cacccgaggg ctcctacagc tgcctgccag atgtggaaag 780 
ggccacgctg acgcttctcc tggaccacct gcgcctcgtc tcctccttcc atgcctacaa 840 
ccgcatgacc ccacagaact tggccgtgtg cttcgggcct gtgctgctgc cggcacgcca 900 
ggcgcccaca aggcctcgtg cccgcagctc cggcccaggc cttgccagtg cagtggactt 960 
caagcaccac atcgaggtgc tgcactacct gctgcagtct tggccagatc cccgcctgcc 1020 
ccgacaatct ccagatgtcg cgccttactt gcgacccaaa cgacagccac ctctgcacct 1080 
gccgctggca gaccccgaag tggtgactcg gccccgcggt cgaggaggcc ccgaaagccc 1140 
cccgagcaac cgctacgccg gcgactggag cgtttgcggg cggggacttc ctgacctgtg 1200 
ggcgggattt cctgtccggg ccagactacg accattgtga cgggcagtga cagcgaggac 1260 
gaggacgagg aggtcggcga gccgagggtc accggtgact tcgaagacga cttcgatgcg 1320
cccttcaacc cgcacctgaa tctgcaaaga cttcgacgcc ctcatcctgg gatctggaga 1380
gagagctctc caaagcaaaa tcaacgtgtg ccttctgagc ccagatgacg gcggtgggga 1440 
ccccggttag taaggaccgg gcgcccagtg gctaaggcgg tgccctggtg accaaggacg 1500 
agccagacct gttgctcagg ccgagctcct gggttgccag cgagttacca cggggaccag 1560 
tcgcgtgtat ggcttgagac ttcattccca gtttccaggg cccggctatt tggacactag 1620 
ttgccaagtc tggggcctgg ggatttcacg ggaccagcgg cttgtgaccc atctttcctg 1680 
agcaccaagg gcttcccctt ttgttgccac aaacggtcgt cctcgcgctt gctagcgctg 1740 
gcctctcttg cctccccttg gccggggcaa caccagttac tgtgagcatc accctgggtg 1800 
tgggtgagtc acctctagta cggccctctt gctgctgcca accaaatcag tattagcttt 1860 
gagcactgca ctgtttctcc ctcccttggg acggacacaa agactaggca tgaggcactc 1920 
tttgtggggg gcagcccnct atccctgggt tccaagcatg ggacacaggg ggtagcctgg 1980 
gggcttatag acggaacaca gcttgtttcc ccctccactt tccccgggga aaaccccacc 2040 
caatggcctt ttagcagcca atggagataa cagagttctg gccccttccc atccccatct 2100 
ccttgccccc cccttgcccc cccccccgaa aaaaatgtga gcacgttaaa cccctccctt 2160 
ttggaggggg ccccctgaag cgtcaggcct gggggcagtt tggtacggga acatatttac 2220 
ttgcctccca tgcatgtgct gtgtgtgtct gtgaggcacg ggtgtgcgtg gacacagtct 2280 
gaaggcaagg catggtgagg gctctattca tgggaccaca gcaggaggga gcagtttgcc 2340 
atgccccacc caccctggaa tcccccatat atggtgcctc agtgggcccc cgagttccag 2400 
tgggagagtg acggttccct cctgtctccc tcttcttttc cgcacctccg atctttgtgg 2460 
ataataaata aatatgcaca ggttctgaaa aaaaaaa 2497 

<210> 33 
<211> 2876 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:757439.1:2000MAY01
<220> 

<221> unsure 

<222> 1472, 1495, 2463 

<223> a, t, c, g, or other 

<400> 33 

cggaagccgc ggtagcggag aagactggag ctccgaggag ctgcatctgc ggcaacctgt 60 
gtgctgacgc tacgtgcctc ctggcttccg acgtagctcg cagctcccca gtctcactcc 120 
attccttccc cacctggcgc gcacctgctc aagaccaggg tcctgccaag cgctaggagg 180 
gcgcgtgcca ggggcgctag ggaactgcgg agcgcgcgcg ccatggggcc gccgcctggg 240 
gccggggtct cctgccgcgg tggctgcggc ttttccagat tgctggcatg gtgcttcctg 300 
ctggccctga gtccgcaggc acccggttcc cggggggctg aagcagtgtg gaccgcgtac 360
ctcaacgtgt cctggcgggt tccgcacacg ggagtgaacc gtacggtgtg ggagctgagc 420 
gaggagggcg tgtacggccc ggactcgccg ctggagcctg tggctggggt cctggtaccg 480 
cccgacgggc ccggggcgct taacgcctgt aacccgcaca cgaatttcac ggtgcccacg 540 
gtttggggaa gcaccgtgca agtctcttgg ttgggcctca tccaacgcgg cgggggctgc 600 
accttcgcag acaagatcca tctggcttat gagagagggg cgtctggagc cgtcatcttt 660 
aacttccccg ggacccgcaa tgaggtcatc cccatgtctc acccgggtgc agtagacatt 720 
gttgcaatca tgattcggca atctgaaagg cacaaaaatt ctgcaatcta ttcaaagagg 780 
catacaagtg acaatggtca tagaagtagg gaaaaaacat ggcccttggg tgaatcacta 840 
ttcaattttt ttcgttttct gtgtcctttt ttattattac ggcgggcaac tgtgggctat 900 
tttatctttt attctgctcg aaggctacgg aatgcaagag ctcaaagcag gaagcagagg 960 
ccaattaaag gcagatgcta aaaaagctat tggaaggctt tcaactacgc acactgaaac 1020 
aaggagacaa gggaaattgg ccctgatggg agatagttgt gctgtgtgca ttgaattgta 1080
taaaccaaat gatttggtac gcatcttaac gtgcaaccat attttccata agacatgtgt 1140 
tgacccatgg ctgttagaac acaggacttg ccccatgtgc aaatgtgaca tactcaaagc 1200 
tttgggaatt gaggtggatg ttgaagatgg atcagtgtcc tttacaagtt tccctgatat 1260 
ccaatagaaa tatctaatag tgcctcctcc catgaagagg ataatcgcag cgagaccgca 1320 
tcatctggat atgcttcagt acagggaaca gatgaaccgc cttctggagg aacacgtgca 1380 
gtcaacaaaa tgaaagtcta cagctggtaa aaccatgaag caaattctgg tggcagtgga 1440 
tgttattcct catgttgaca acccaaccct tntttttgga agaagactgg aaaanctcct 1500 
aatcaagaga ctgctgttcg agaaattaaa tcttaaaatc tgtgtaaata gaaaactgtg 1560 
aaccattaag taataacaga actgccaatc agggcctagt ttctattaat aaattggata 1620 
aatttaataa aataagagtg atactgaaag tgctcagatg actaatatta tgctatagtt 1680 
aaatggctta aaatatttaa cctgttaact tttttccaca aactcattat aatatttttc 1740 
ataggcaagt ttcctctcag tagtgataac aacattttta gacattcaaa actgtcttca 1800 
agaagtcacg tttttcatct tataacaatt ttcttataaa aacatgttgc ttcttaaaat 1860
gtggagtagc ctgtaatcac tttattttat gatagtatct taatgaaaaa tactacttct 1920
ttagcttggg ctacatgtgt cagggttttt ctccaggtgc ttatattgat ctggaattgt 1980 
aatgtaaaaa gcaatgcaaa cttaggcgag tacttcttga aaatgtctat ttaagctgct 2040 
ttaagttaat agaaaagatt aaagcaaaat attcattttt tactttttct tatttttaaa 2100 
attaggctga atgtacttca tgtgatttgt caaccatagt ttatcagaga ttatgggact 2160 
gaattgattg gtatattagt gacatcaact tgacactaga ttagacataa aattccttac 2220 
aaaaatactg tgtaactatt tctcaaactt gtgggatttt tcaaaagctc agtatatgaa 2280 
tcattcatac tgtttgaaat tgcgtaatga ccagagtaag taacactgaa tattgggcca 2340 
ttgatcctcc gttccatgaa ttagtctacc agaaaaaaaa tggttctgta aaaattagtc 2400 
ctgttggaaa atggtttttc caaacaatgt ttactttgaa aattgagttt atgtttgacc 2460 
ctnaatgggc gtaaaattac attagaataa acgtaaaatt gctgtgccgt gtaactgata 2520 
aattattgtg aaatgcatta ttcactggtg tattgaaaaa agaagaggga gggagaatta 2580 
ccaggtgcca ttaataataa agatttgaag ctatcattcc accaatagtt aaatttagag 2640 
actaatttaa aatatgcaca tttaatttgt acatctgtga tggcttattg tatatagaat 2700 
atttgtatac aaatatatag cagaatttag gcaaaaaata aaacagacatc gtatttttgt 2760
gtgctgaatg gatgaaacca attgcattct tgtacactga tatacaaatg ctgtaaatat 2820 
gtccccattt ttattgattc tctttaaata taaaatgtaa ataaaatatt ccaata 2876

<210> 34 
<211> 1288 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:1144066.1:2000MAY01
<220> 

<221> unsure 
<222> 1243 

<223> a, t, c, g, or other 
<400> 34 

ggggtgcgac gccgagggcg ggggagcgcg cgccgctgct cccggaccgg gccgcgcacg 60 
ccgcctcagg aaccatcact gttgctggga ggcgacctgt acaaatccta agcgaatttt 120 
ttggagcatt ttcaccccgg aaactcgcca tccagaagtg tgcttcccgc acagctgcag 180 
ccatggggtc tgaggaccac ggcgcccaga aacccagctg taaaatcatg acgtttcgcc 240 
caaccatggg agaatttaaa gacttcaaca aatacgtggg ctacatagag tcgcagggag 300 
cccaccgggc gggcctgggc aagatcatcc ccccgaagga gtggaagccg cggcagacgt 360
atgatgacat cgacgacgtg gtgatcccgg ggcccatcca gcaggtggtg acgggccagt 420 
cgggcctctt cacgcagtac aatatccaga agaagggcat gacagtgggc gagtaccgcc 480 
gcctgggcaa cagcgagaag tactgtaccc cgcgggacca ggactttgac gaccttgaac 540 
gcaaatactg ggaaggaacg ctcaccttgt gtctccccga tctacggggc tgacatcagc 600 
ggctcttggt atgatgacga cgtggcccag tggaacatcg ggagcctccg gaccatcctg 660 
gacatggtgg agcgcgagtg cggcaccatc atcgagggcg tgaacacgcc ctacctgtac 720 
ttcggcatgt ggaagaccac cttcgcctgg cacaccgagg acatggtacc tgtacagcat 780 
caactacctg cactttgggg agcctaagtc ctggtgagtg tctacactgg ccctgccgcc 840 
ggccggaccg agagcccctc gggagggagt caatcccggg tacacggctg ggcgccgtgg 900 
caggggcccc accaggtgag gccgcaaagg tcggcctatg acggctggag atcttccgga 960 
ccgcctgggg tcacccacca gctttggggt gggggatgtg cacccccaga gccgaagctc 1020
ccaggcccct agagcttgcg ctttgtaccc cggagtgccc cccattgagc tgtgagcggc 1080 
cccaggtgtc cccatggcca ggagcgtggt cttgagcctc ctgagctgcc caggctgtgc 1140 
tgcctcacag ccaagtggag acgttcctgg tgaagggaca ctgtccatgc tgcccagagg 1200 
ggcctggcca ggatgaccct gcagccgctc cctcgcagtc tcngccctgg cacgtctggg 1260 
ccaggcccta cagttaggag ggcagggc 1288

<210> 35 
<211> 5271 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:243660.4:2000MAY01
<220> 

<221> unsure
<222> 3667

<223> a, t, c, g, or other 
<400> 35 

tgaccctgag cggccccctg gagccacatg ccctgagagc ccaggacccg gaccccccac 60
accctttggg ggtggtggaa tctggtaagg gtccgcctcc caccacggag gaggaggcct 120
ccggcccccc aggagagccc cggctggaca gtgagacaga gagtgaccat gatgatggct 180
ttcttttcaa taaggtctcc tgagatccag ttgcctctac cgcccggaaa acgtcggacc 240
cagtccctca gtgccctacc caaggaacgg gactcatctt ctgagaagga tggacgcagc 300
cccaacaagc gggagaagga ccacatccgg cggcccatga atgccttcat gatcttcagc 360
aagcggcacc gggccctggt ccaccagcgt catcccaacc aggacaaccg gaccgtcagc 420
aagatcctgg gcgagtggtg gtatgccctg gggcccaagg agaagcagaa gtaccacgac 480
ctggccttcc aggtgaagga ggcccacttc aaggcccacc cagcattgga agtggtgcaa 540
caaggaccga aagaagtcca gctcagaggc caagcccacg gagcctgggg ctggcaggag 600
ggcacaagga gacgcgggag cggagcatgt cggagacggg cactgctgcg tgcccctggg 660
tgtgtcctcg tgagctcctg tccgttgcag cccagacact cctgagctca gacaccaagg 720
ctccggggag cagctcctgt ggggcagaac ggctacacac agttggggga cctggctcag 780
cccggccccg agctttctcc cacagcgggg tacacagcct ggacggcgga gaagtagaca 840
gtcaggcgct acaggaactg acgcagatgg tgtctggccc tgcatcgtac tctggcccaa 900
agccttctac ccagtatgga gctccaggac cctttgcagc ccctggtgag ggaggtgcct 960
tggcggccac tgggcggccc ccgctgctgc ccacccgagc ttctcgttct cagcgtgcgg 1020
ccagtgagga catgacgagt gatgaggagc gcatggtcat ctgtgaggag gaaggggatg 1080
atgatgtcat tgctgacgat ggcttcggcc ccactgacct tgatctcaag tgcaaggagc 1140
gggtgaccga cagcgagagt ggggacagct ctggggagga cccagagggc aacaagggct 1200
ttggtcggaa ggtgttttca cctgtgatcc gttcctcctt tacccactgc cgccccccac 1260
tggaccctga gcccccaggg cccccggatc ctcctgtagc ctttggcaaa ggctatggtt 1320
ccgccccatc ctcctctgcg tcctcgcctg cttcctcctc agcctcggca gccacctcct 1380
tctcactggg ctcaggaacc ttcaaggccc aggagtctgg tcagggcagc acagcgggcc 1440
ccctacggcc cccaccccct ggggctgggg gtccagcgac accttccaag gcaacccggt 1500
tcctcccaat ggatcctgcc accttccggc gcaagagacc cgaaagtgtg ggtggcctgg 1560
agccaccagg cccctcagtc atcgcggccc ctcccagcgg aggaggaaac atcctgcaga 1620
cactggtgct gcccccaaac aaggaggagc aagagggcgg cggagccaga gtgccctccg 1680
cccccgcccc atcactggcc tatggggccc cagcagctcc cctgtcccgt cctgccgcca 1740
ccatggtcac caatgtggtg cggcctgtca gcagcactcc tgtgcccatc gcctctaagc 1800
ccttccccac ctctggccgg gctgaggcgt ctccaaatga cacagcaggt gccaggactg 1860
aaatgggcac tgggtctcgg gtgcctgggg gctccccgct gggtgtcagc ttagtgtatt 1920
cggacaagaa gtcggcagca gccacctcac cagccccaca cttggtggct ggacccctgc 1980
tgggcactgt ggggaaggcg cctgccactg tcactaacct actggtgggc accccggggt 2040
atggggcccc tgcgccccct gctgtccagt tcattgccca gggggcccct ggtggtggga 2100
ccactgcggg ctcaggagca ggtgctggga gtggccccaa tgggccagta cccctgggca 2160
tcctgcaacc aggtgccctg ggcaaggctg ggggaatcac ccaggtacag tacatcctgc 2220
ccacgctgcc ccagcagctt caggtggcac ctgccccagc accagcccct gggaccaagg 2280
cagcggctcc catgcggccc tgcacccacc accagcatcc gtttcaccct cccaccgggc 2340
acttccacca acggcaaagt cctggctgcc actgcaccca ctcctggcat ccccatcctg 2400
cagtctgtac cctccgcccc accccccaaa gcccagtcag tttctcccgt gcaggccccg 2460
cccccgggtg gctcagccca gctgctgcct gggaaggtcc tagtgcctct ggccgcccct 2520
agcatgtcat ttgcggggtg gaaggggcgg gacagccaca tgccgacatg gtgagcccag 2580
cccttctcag tacctgtgca aaatggtgcc cagtccccca gaaagatcat ccagctgacc 2640
ccggtgccct gtgagcacac ccagcggcct ggtgcctgcc cctgaggacc ccagacacac 2700
tccctggacc cacctgctca atctcagaag gtccttgttg acctcactcc accagaatca 2760
cctatgtgca gtcagcgggc gggcacgcgc tgcccctggg taccagccct gcgtccagcc 2820
aggctggaac agtcacctcg tacgggccca cgagctctgt agctctaggc ttcacctcgc 2880
tggggcccag cggccccgcc ttcgtgcagc ccctgctctc agcaggccaa gccccactgc 2940
tggctcccgg tcaggtgggc gtgtcacctg tgcccagtcc ccagctgccg cctgcctgtg 3000
cagcccccgg aggtcctgtc ataacagcat tttactctgg cagccctgca cccacctcct 3060
cagcacccct tggcccagcc atcccaggcc cccccaagcc tggtctacac tgtggccacc 3120
agcacaaccc cacctgcagc caccattctg cccaagggcc cgccagcccc tgccactgcc 3180
accccagccc cgactagccc tttccccagc gccacagcag gttccatgac ctacagctta 3240
gtggccccca aggcccagcg gcccagcccg aaggcccccc agaaagtgaa ggcagccatc 3300
gccagcattc ccgtggggtc ctttgaggca ggtgcctctg ggcggcctgg ccctgcaccc 3360
cggcaggcgt ctggagcctg gcccagtccg agagccaact gccccagagt ctgagcttga 3420
gtgggcagcc cacaccacca gcccctccac ccctgccaga gacctggact cccacggccc 3480
ggagcagccc ccacacgtgg ccccgcacac tgcgtgaagg agcaggacca tgcggccaag 3540
ggccctgaga cccatggcca gcaaattccc cagctcatct tcagactggc gcgtccctgg 3600
gcagggcctg gagaatcgtg gggagcctcc cactcctccc agcccggccc cagctccagc 3660
tgtagcnccc tggtggcagc agcgagagca gcagtgggcg ggcagccggg gacaccccgt 3720
gagcgcaatg gaggctgtgc tggtactggc aagaaggtga aggtgcggcc cccgacccct 3780
gaagaagacc tttgactctg tggacaacag ggtcctgtca gaagtggact tcgaagagcg 3840
ctttgctgag ttgcctgagt ttcggcctga ggaggttgct gcctccccca acctgcagtc 3900 
tctgggccac ctcacacccc gggcccatcc atgggcatct ttaccgcaaa gaagaggaag 3960
aactccacgg acctggattc agcacccgag gaccccacct cgcccaagcg caagatgaga 4020 
agacgctcca gctgcagctc ggagccctac acccccaaga gtgccaagtg cgagggggac 4080 
atcttcacct ttgaccgtac aggtacagaa gccgaggacg tgcttgggga gctagagtat 4140 
gacaaggtgc catacgtcct cacctgcggc agcatccctg gaccacgcgc cagggccctg 4200 
gtcatgcagc tctttcagga ccatggcttc ttcccgtcag cccaggccac agccgccttc 4260 
caggcccgct atgcagacat ctttccctcc aaggttgtgt ctgcagttga agattcgtga 4320 
ggtgcgccag aagatcatgc aggctgccaa ctcccaccgg agcagccccc tggagctgag 4380 
gctcctctcc ctgtaccgcc ccccactggc accgctgcat gcacctgccc ccactcccag 4440 
ccccgcaggg ggccctgacc ccacctcacc cagctcggac tctggcacgg cccagtgctg 4500 
ccccgcacac tgcctcacac ccccagagtc ggggcctgga cagcctggct gggagtgggg 4560 
ctccccagcc cttccccccc acccccaggt ccctccaaag gttgccacag gcagggtgag 4620 
ggacccctcg agaagatgcc aggacttata gtaccccctc aggacatgga cagtatgtgg 4680 
gggcaggaag gttatctcct cccgggtaaa gccatttgcg tcctctccag tttggggcgg 4740 
aatgaggcct gctcctcttg taaatacccc cttccctctg aagctccctc ccggtgctgg 4800 
ggggcagctg ccggggagag ctgcaggggc aagtctccct cctccaagcc cctgtacata 4860 
acctggcagc gtgtgacctt cagagctttt cactttatgc aaaaatggct cctgtgaggg 4920 
ctgcaaggct ggagggtggt gcaggccttg ggcccacagg gagtgcgcct gtggaatagg 4980 
ggggagtttc atgcacccct ttttttcccc agagggggct ggactcaggg ttagtttgag 5040 
gggtgggggc tccctgcact ttgccacaag gccacgggga gggttttctc cttcaccccc 5100 
ttctgccctc ccaacttggg ttgtactttc taaagaaggt gattcccccg tggcccttgg 5160 
gccccttccc caaggaacaa aacattgttg atcatggtgc aatatttctt actgataccg 5220 
agaagccgca atgagcgaga ttaaagcctg tttacacaaa aaaaaaaaag g 5271 

<210> 36 

<211> 6070 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: LI:334386.1:2000MAY01
<400> 36 

ggctagacta ggacatacca aggtggttaa ttgtttgatt gggtgtggag caaatattaa 60 

tcatactgat caagatggtt ggacagcatt aagatctgct gcttggggtg gccatactga 120 

ggtagtttct gcactacttt atgctggcgt aaaagtggat tgtgcagatg ctgatagccg 180 

aacagctttg agagcagcag catggggagg acacgaggat attgtactga atttgctaca 240 

acatggcgct gaagtgaaca aagctgataa tgaaggtaga actgctttga tagcagcagc 300 

atacatggga catagagaga ttgtggaaca cctactggac catggagcag aagtaaatca 360 

tgaggatgtt gatggcagga ctgcactctc tgtagctgca ctttgtgtgc ctgcaagtaa 420 

agggcacgca tcagttgtta gccttttaat tgatcgaggt gctgaagtag atcattgtga 480 

taaagatggc atgactccac tgctggtagc tggctatgaa ggacatgttg actgtggttg 540 

acttgcttct agaaggggga gcagatgtag atcacacaga taacaatggc cgtacacccc 600 

tcttagcagc agcgtctatg ggtcatgcat cagttgtaaa tacacttttg ttttggggtg 660 

cagctgtgga tagtattgat agtgaaggta ggacagtcct cagtatagct tcagcacaag 720 

gaaatgttga ggtggtacgt actctactgg atagagggtt agatgaaaat cacagagatg 780 

atgctggatg gacacctttg cacatggcag cttttgaagg gcacagattg atatgtgaag 840 

cacttattga acaaggtgct agaacaaatg agattgacaa tgatggacga atccctttca 900 

tattagcttc acaagagggt cattatgatt gtgttcaaat attactggaa aacaaatcca 960 

acattgatca aagaggttat gatggaagaa atgcactgcg ggttgctgca ttagaagggc 1020 

acagggacat tgttgaattg ctttttagcc atggtgctga tgttaactgc aaagatgctg 1080 

atggtcggcc tacactttat atcttggcct tagaaaatca gcttacaatg gccgaatatt 1140 

ttttagaaaa tggtgcaaac gtagaagcaa gtgatgctga aggaaggaca gcacttcatg 1200 

tgtcttgttg gcaaggccat atgggaaatg gtgcaggtcc tgatagcata ccatgcagac 1260 

gtcaatgctg cagacaatga aaagcgctct gctttgcagt ctgcagcctg gcagggccat 1320 

gtaaaagtgg ttcagcttct gattgagcat ggtgctgtag ttgaccatac atgtaaccaa 1380 

ggtgcaactg cactctgtat tgcagcccag gaagggcaca ttggttgtgg ttcaggtctt 1440 

attagagcat ggtgctgatc caaaccatgc tgatcaattt ggacgcactg ctatgcgtgt 1500 

tgcagccaaa aatggacatt ctcagataat taaattatta gaaaaatatg gtgcatctag 1560 

tttgaatggc tgttccccat ctcctgttca cacaatggag caaaaacctc tacagtcatt 1620 

gtcttcaaaa gtgcagtcat taacaattaa atcaaatagc tctggtagta ctggtggagg 1680 

ggatatgcag ccttcgttac gtggtttacc ctaatgggcc tactcatgct tttagttctc 1740 

cttcagaatc tccagattct acagttgacc ggcagaagtc atcactgtca aataattccc 1800 

tgaaaagctc aaaaaattca tctttgagaa ctacttcatc tacagcaacg gctcaaacag 1860 
tgccaattga tagctttcat aacttgtcat ttacagaaca aattcagcag cattcattgc 1920
cacgcagtag taagtcgaca gtcaattgtt tccccatctt ccacaacaca gtccttagga 1980
cagagtcata attcaccaag tagtgaattt gagtggagtc aagtaaagcc cagtttgaag 2040
tcaactaaag caagtaaagg ggggaaatca gaaaattctg ccaagtctgg atcagctggg 2100
aaaaaagcga aacaaagtaa ttcttcacag ccaaaggttt tagaatatga aatgactcag 2160
tttgatagga agaggaccta ttaggccaaa tccgggactg gctggcaccg ccttaaacaa 2220
atgccagcag aatctcaatg caaaattatg ataccttcag ctcagcagga aattggtcga 2280
tctcaacagc agtttcttat tcaccaacaa agtggtggaa cagaagaaga gaaatggaat 2340
aatgacaaat ccaaattatc atcttcagag caaccaggtt tttcttggta gggtttcagt 2400
cccacgaaca atgcaagata gagggcatca ggaagtgttg gagggatacc cttcctcaga 2460
gacagaatta agcctttaaa caagctctga agcttcagat tgaaggttct gaccctagct 2520
tcaactataa aaaggaaaca ccattataaa agtttcctat tctgtgaaac agaaggacat 2580
tgtgatggag tggttcttca gctactggat gggaaacata tgcctgttga tttgctgaaa 2640
aaacataaaa aatgaagaat gtgatcttct ggcagtacag ttaccttaat tactgtaatg 2700
tgcctaaata gtaaggctgc cttctcaatg taaccctctg tgcttaaaaa atttcatttt 2760
gtgtgctttg tattcactac acaggaataa gcacttttta aaaatgcaga tacatactgc 2820
agttccctga taaaagctga aaagaaaatt tgagtatttt aagttaaaat tgtgataaaa 2880
aaatgtgcat gtgccataat caaatatata tgaaaaggca gtgttccttg tatttatttt 2940
tttttctttt tgtggcaaaa gaaacttaaa catactgttt cagtcacatt gcattgtagt 3000
gtatggcctg tttcttgtat cttgacaaga cgtagctcaa taaacacata tcttgcaacg 3060
tgttctatgt tcactcacac cttcagcatt ggataaaaat catttcctat ataaatatca 3120
cattgaaatg aaaaaatgat tgctcgcgca gtttacaaca actcatttta tagtacttta 3180
gggacgctgc tttaagtagt tccgatctgg actctcccag tagaattctt ctcatctctg 3240
gctaaacatg tcagaacaaa caaccaacca gtctgttggc agaacaaagt cctatttcat 3300
ccgcctggga tacaatttca tctttccatt cacctttgtc attccacctc ctaagaagac 3360
agacttatca ttcctgaggc atgaaaattc tcagggacaa agccatgcct cagtcacatg 3420
tgtgtgcaga gagaaatgca cctgtctatc taagggtaga tttttgatcc ctggaataat 3480
tcattgacta aactgacctc ttcctcctgg gctaaataaa ttaattttgc tggcttctct 3540
ctcagcggtt tctattttgt aaattgctgc atgaccaaaa tagccccact caaaatcaat 3600
tggattaatt ttaatggttt ggttggatga atattctgga tgaatataaa atgtgctgcc 3660
cttcacagat gacaccactc ccctgtcaat catagcacat gtgtactttt tattgttact 3720
taatagtgat ggatttgcac ttttctatcc tcatactctt tcctgttttc ttctttgtac 3780
aattgcatgc aggagggctg gatgccaggg ttaagagaga tattcatgac aaggaaggta 3840
aaattggttc aaatgagcat gtgtcccaca gccttagtct ccttactctt aaatcagtgg 3900
agctgtagct tagatgggtc gtttatatgt ctggagaagt tgtcataaca gtttagaagc 3960
caaggttgtg gatttgatct tagaatgggc ctgttaatct tatagaaccc agaaattctg 4020
ttcttttcat gtactgttga tgaggaatga ggaaagagaa tttggagatt cagcacacag 4080
accattgcta tttctagaac aaaatcttta gaatatgtcc taataacaaa ggtcagtaat 4140
gtcatcttca tatatgaagg acacatgtgg atacattgtt cgtggaaata gaaatgtata 4200
atfcatgataa atgtttcact tgaatcttat tggtaaggct tttcttgctt ttatttttta 4260
aagtcagcaa aactcatttt gtctgtcatc tataagtcat agtgaggact acgagataag 4320
ttgataagtt caattgatat taaggtgaca tggcaatatt aataactcaa atgtgaatgt 4380
ttcaagtatt gaattcttgt ccctatggga catttattaa ataaaattaa tgggactctt 4440
acaaagtagc cttaaaagtg ttaatgagtc tattaataaa tatgaacaca tactattatt 4500
agaaccaact ttactcatat ctcaaataca gtacatttac attactggta aagggatgaa 4560
gctctaattt ctattacatg taattttctt tagaaagaga accctgaaag ctgccagttt 4620
ttctattaac tttgaattat tgaaattgta tttatttaat tttattgttt ttacaaaatt 4680
gcaccttgtg tccaaagggc gaagatatga cattgcataa gggatttatg tttttcaaag 4740
agctaactgt tttcatatcc atactatata cacttgaagc aattggtagg aagagaacct 4800
attggaaaaa tacgattttc aaaagtaagt attcctcggg atgtttttat ataaattatg 4860
ttttggaata gaacataaat gactttgagt taggataagg atgataggat gggtgctgga 4920
gatgccctgc cttgcttatt tgtttctgtg ggggatttaa tactataaat gaaaatggct 4980
tctgctccat gtgggaaagt aaaaattgtg cctccaaata taaaactcct tcaatagata 5040
tttttgaaag ttaaaatctt taaattttat aaatcagctg ttgccacgta cacaactatg 5100
atggcatgtg cttataacat ttttataaca ttccgtctgt gtctattccc taaatatcat 5160
gagtttataa taacggaaag tgataaaaat acagtaatga aataaaaatc tggaatgttt 5220
ctaaggaaag gttttggaag gaaaaatgac ctgaaatcag tgcgtactgc aaaggtacag 5280
ccgaataatt tttgttcttt gttttcaccc ccaccctcac tggaattctc accaaaaata 5340
gtttgatacc ttaaaaaaca gaagtaaata cttttcctca atattgtttt gggctatgca 5400
aatatttgtt ttgcttaatg tcctccattt acacttgctc agcaaatggt agttgcaaac 5460
aaatgctttc tttttatttt tcccttgggt ttgagggtat gtaaatagcc aaaaatgtac 5520
attggaattt cacattgtaa aagttttatt ttatcccttg gtatgatatt actcaaaaaa 5580
tccctgtgta tatgaaagtg ccataataaa tatatttgct ttacagagaa gatcttgttt 5640
taattttgtc cttgaaccag tagaacatgc atgatatgca tatagcataa acaactgtta 5700
gttgttttaa gttattatct taaataaatc ctgagcaaat gaatttggaa acattttgca 5760
aagaagaaag tgaaatataa cactgtccaa aggaaggtag aaaaacaaag atttactgtt 5820
tatgtcttcc taagcctttt taaagactta atgttctttt cccccccccg tgactgatta 5880
tatactatat cacaccatgt caggttttgt gcctctgaga attgcagtaa tccaataact 5940
tttgtatatg tgtgctcctt gatcatcaga atattatggc catcttatgg cggatatttt 6000 
gggagtttat tgcaaacatg gtcattcatt ttctaaataa aatttgtgtg tttcttcact 6060 
cagtaaaaaa 6070 

<210> 37 
<211> 3474 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:347572.1:2000MAY01
<400> 37 

gtcattcagt ggatgtgatc tgtggctcac aggggacgat gtcaagctcc ttcctggctc 60 
cttctcagcc ttgttgcctg taactggctg ctcagtccac cattgaggaa caggccaaga 120 
catttttgga caagtttaac cacgaagccg aagacctgtt ctatcaaagt tcacgttgct 180 
tccttggaat tataacacca atattactga agagaatgtc caacaacatg caataagttg 240 
ctggcgagac aaatgtgtct agcccttttt acaaggaaca gtccacactt gcccaagatg 300 
tatccactac aagcaaactt cacgacatct ccacatgtca acgcttcagc tgtgcacggc 360 
ttcttcaagc cataaaactg tgagtcttca ggttggtcat cacgaagcac agagagcaaa 420 
ccggttgaac acaatttcta atatacaaat ggagccacca atcctaacag taactggaaa 480 
acgtcgtaac ccagataatc cacaagaatg cttattactt gaaccaggtt tgaatgaaat 540 
aatggcaaac agtttagact acaatgagag gctctgggct tgggaaagct ggagatctga 600 
ggtcggcaag cagctgaggc cattatatga agagtatgtg gtcttgaaaa atgagatggc 660 
aagagcaaat cattatgagg acttattggg gattattgga gaggagacta tgaagtaaat 720 
ggggtaaata gtggatatga ttacagccgc ggccagttga ttgaagatgt ggaacatacc 780 
tgttgaagag attaaaccat tgataggaac atcttcagcc ctatgtgagg gccaagttga 840 
tgaatgccta tccttcctat atcagtccaa ttggatgcct ccctgctcat ttgcttggtg 900 
atatgtgcgg gtagattttg gacaaatctg tactctttga cagttccctt tggacagaaa 960 
ccaaacatag atgttactga tgcaatggtg gaccaggcct gggatgcaca gagaatattc 1020 
aaggagtccg cagaacttct ttgtatctgt tggtcttcct tatatgactc taggattctg 1080 
cggcaaattc catgctatac ggacccagga aatgttcaga aagcactctg ccatccccac 1140 
agcttgggac ctggggaagg gcgacttcag agatccttat gtgcacaaag ggtaacaatg 1200 
gacgacttcc tgacagctca tcatgagatg gggcatatcc agtatgatat ggcatatgcc 1260 
ggccaacctt tttctgctaa ggaaatggag cttaatgaag gattccatga agctgttggg 1320 
gaaatcatgt cactttctgc agccacacct aagcatttaa aatccattgg tcttctgtca 1380 
cccgagtttt caacgaacga caatgaaaca gaaataaact tcctgctcaa acaagcactc 1440 
acgattgttg ggactctgcc atttacttac atgttagaga agtggaggtg gatggtcttt 1500 
aaacggggaa attcccaaag accagtgggt gaaaaaggtg gtgggagatg aagcgaaaga 1560 
atagttgggg tgtgtggaac ctgtgcccca tgatgaaaca tatctgtgac cccgcatctc 1620 
tgttccatgt ttctaatgat tactcattca ttcgatatta cacaaggacc ctgttaccaa 1680 
ttccagtttc aaagaagcac ttttgtcaag cagctaaaca tgaaggccct ctgcacaaat 1740 
tgtgacattc tcaaattcta cagaacgtcg tggacagaac actgttcaat atgctgaggc 1800 
ttggaaaact cagaaccctg gaccctagca ttggaaaatg ttgtaaggac caaagaacat 1860 
gaatgtaagg ccacctgctc aactactttg agcccttatt tacctggctg aaagaccaga 1920 
acaagaattc ttttgtggga tggagtaccg actggagtcc atatgcagac cacagcatca 1980 
caagtgagga taagcctaaa atcagctctt ggcagataaa gcatatgaat ggaacgacca 2040 
atgaaatgta cctgttccga tcatctggtt ggatattgtt aattgaggca gtacttttta 2100 
acaagtaaaa aatcagatga ttctttttgg ggaggaggat gtgcgagtgg ctaatttgaa 2160 
accaagaatc tcctttaatt tctttgtcac tgcacctaaa aatgtgtctg gatatcattc 2220 
ctagaaactg aagttgaaaa ggccatcagg atgtcccgga gccgtactcc atgatgcttt 2280 
ccgtctgaat gacgacagcc tagagtttct ggggatacac ccaacacttg gacctcctaa 2340 
ccagccccct gtttccatat ggctgattgt ttttggagtt gtgatgggag tgataattgt 2400 
tggccatggt catcctggat cttcactgga atcagagatc ggaagaagaa aaataaagca 2460 
agaagtggag aataatcctt tatgcctcca tcgatattag ctaaggagta taaataatcc 2520 
aggattccga aacactgatg atgttcagac ctccttttag aaaaatctat gtttttcctc 2580 
ttgaggtgat tttgttgtat gtaaatgtta atttcatggt atagaaaata taagatgata 2640 
aagatatcat taaatgtcaa aactatgact ctgttcagaa aaaatattgt ccaaagacaa 2700 
caagtgccaa ggagagagca tcttcattga cattgctttc aagtatttat ttctgtctct 2760 
ggatttgact tctgttctgt ttcttaataa ggattttgta ttagagtata ttagggaaag 2820 
tgtgtatttg gtctcacagg ctgttcaggg ataatctaca atgtaaatgt ctgtctgaat 2880 
ttcttgaagt tgaaaatcaa ggatatatca ttggagcata gtgttggatc ttgtatggaa 2940 
tatggatgga tcacttgtaa ggatcagtgc ctgggaactg gtgtagcttg caaggattga 3000 
gaatggcagt gcattagctc acttgtcact ggcatccatt ggtcaaggac tgacatgctt 3060 
tccttcacag tgaactcagt tcaagtficta tggtgatttg cctacagtga tgttgtggaa 3120 
tctgatctat gctttccttc aaggttgaca ggtcctaaag agagacagaa tccagggtac 3180
aggtagagga catttgcttt ttcacttcca aggtgtcttg tatcaacatc ttcctgtaca 3240
aacactgaaa tctagagctc aggggctctc gcgtgaatct cccagagaca tgcctgtata 3300
gaaatctcta tttctagctg ttctctaact gtcggagtga tatggaatat tccaactgta 3360
tgttcaccct ctgaagtggg tacccagtct cttaaatctt ttgtatttgc tcacagtgtt 3420
tgagcagtgc tgagcccaaa gcagacactc aataaatgct agatttaccc cctc 3474



<210> 38 

<211> 3474 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: LI:817314.1:2000MAY01



<400> 38 

gctttcagag catcctcact ccgcccagtt cggtgccagc tgcgtgggct ccagcttcga 60
tcgttttcct tggaatgctc caaaactcag cagcgactaa gggaattcca ttggaatttg 120
ccgggcgtgc tctcaccccg cacggcaccc gcgccgtcag tcctcggatc ccatcacttc 180
agcccgaaga ttgcaacttt gcagagacga agaaatagca tggcatgaaa tatggctcag 240
ttctattaca aaagaaatgt taatgctccc tatagagacc gcatccctct aaggatagta 300
agagcagaat cagaactctc gccatcagaa aaagcctact tgaatgctgt ggaaaaggga 360
gattatgcca gtgtcaagaa atccctagag gaagctgaaa tttattttaa aatcaatatt 420
aattgcattg atcctctcgg aagaactgct ctcctcattg caattgaaaa tgagaacttg 480
gagctcatcg aactactctt aagctttaat gtctatgttg gagatgctct attacatgct 540
atcagaaaag aagtcgtcgg agctgttgag ctgttattga accacaaaaa acctagtgga 600
gaaaaacagg tgcctcctat actccttgat aagcagttct ctgaattcac tccagacatt 660
acaccaatca ttttggcagc ccatacaaat aattatgaga taataaaact cttggttcag 720
aaaggagtct cagtgcctcg accccacgag gtccgctgta actgtgtgga atgcgtgtcc 780
agttcagatg tggacagcct ccgtcactca cgctccagac tcaacatcta caaggccttg 840
gccagtccct ctctcattgc actgtcaagc gaagatcctt ttctcacagc ctttcagtta 900
agttgggaac ttcaggaact gagcaaggtg gaaaatgaat tcaagtcgga gtatgaagag 960
ctgtcacggc agtgcaaaca atttgctaag gacctactgg atcagacgag aagttccaga 1020
gaactggaaa tcattcttaa ttaccgagat gacaatagtc tcatagaaga acaaagtgga 1080
aatgatcttg caagactaaa attggccatt aagtaccgtc aaaaagagtt tgttgcccag 1140
cccaattgtc aacagctgct ggcatctcgc tggtacgatg agtttccagg ctggaggaga 1200
agacactggg cagtgaagat ggtgacatgt ttcataatag gacttctttt tcctgtcttc 1260
tctgtgtgct acctgatagc tcccaaaagc ccacttggac tgttcatcag gaagccattt 1320
atcaagttta tctgccacac agcctcctat ttgacttttt tgttcctgct gctgcttgcc 1380
tctcagcaca tcgacaggtc agacttgaac aggcaaggtc caccaccaac catcgtcgag 1440
tggatgatat taccgtgggt cctgggcttc atatggggag aaattaaaca gatgtgggat 1500
ggcggacttc aggactacat ccatgattgg tggaatctaa tggactttgt aatgaactcc 1560
ttatatttag caacaatctc cttgaaaatt gttgcatttg taaagtacag tgcccttaat 1620
ccacgagaat catgggacat gtggcatccc actctggtgg cagaggcttt atttgctatt 1680
gcaaacatct tcagttctct gcgtctgatc tcactgttta ctgcaaattc tcacctggga 1740
cctctgcaaa tatctctggg aagaatgctc ctggacattt tgaagtttct attcatatac 1800
tgccttgtgt tgctagcatt tgcaaatggc ctaaatcaat tgtacttcta ttatgaagaa 1860
acgaaagggt taacctgcaa aggcataaga tgtgaaaagc agaataatgc attttcaacg 1920
ttatttgaga cactgcagtc cctgttttgg tcaatatttg ggctcatcaa tttatatgtg 1980
accaatgtca aagcacagca tgaatttact gagtttgttg gtgccaccct gtttggggac 2040
attacaatgt catctctctg gttgttctac tcaacatgtt aatagctatg atgaataatt 2100
cttaccaact gattgctgac catgcagata tagaatggaa atttgcacga acaaagcttt 2160
ggatgagtta ttttgaagaa ggaggtactc tgcctactcc cttcaatgtc atcccgagcc 2220
ccaagtctct ctggtacctg atcaaatgga tctggacaca cttgtgcaag aaaaagatga 2280
gaagaaagcc agaaagtttt ggaacaatag gggtaagaac acagcatagg cgagctgctg 2340
ataacttgag aagacatcac caataccaag aagttatgag gaacctggtg aagcgatacg 2400
ttgctgcaat gattagagat gctaaagact gaagaaggcc tgaccgaaga gaactttaag 2460
gaactaaagc aagacatttc tagtttccgc tttgaagtcc tgggattact aagaggaagc 2520
aaactttcca caatacaatc tgcgaatgcc tcgaaggagt cttcaaattc ggcagactca 2580
gatgaaaaga gtgatagcga aggtaatagc aaggacaaga aaaagaattt cagccttttt 2640
gatttaacca ccctgattca tccgagatca gcagcaattg cctctgaaag acataacata 2700
agcaatggct ctgccctggt ggttcaggag ccgcccaggg agaagcagag aaaagtgaat 2760
tttgtgaccg atatcaaaaa ctttgggtta tttcatagac gatcaaaaca aaatgctgct 2820
gagcaaaatg caaaccaaat cttctctgtt tcagaagaag ttgctcgtca acaggctgca 2880
ggaccacttg agagaaatat tcaatctgga atctcgagga ttagcttcat cggggtgacc 2940
tgagcattcc cggtctcagt gaacaatgtg tgttagtaga ccatagagta aggaatacgg 3000
acacactggg gttacaggta ggaaagagag tcgtgtccat tcaagtcaga gaaggtgtgt 3060
ggtggaggac acggttccta taataccaaa ggagaaacat gcaaaagaag aggactctag 3120
tatagatcta tgatctaaac ctcccagaca cagtcaccca cgaagattac gtgaccacaa 3180
gattgtgata cttgaaggag gaagcgttta ccatacacat acgtattttc cgtagtgctc 3240
tgggtggggg aaaatgttta aattgtatta gcaaatgact aaattacact ttatagcgtt 3300
taatcaagat gtggaatatt acctgtaaca tgtttaaatt aaggcaaagg caatcaaaaa 3360
cctttttgtt ttgtagcctg cttttgcttt cacaatttgt cttacaattg tttttgttaa 3420
taaataaatg caccttgtat tcttgtactg ttgcaataac ccacagaaac attt 3474

<210> 39
<211> 1613
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: LI:000290.1:2000MAY01



<400> 39 

gtgagggctc ttgggttagt tcctgttagg ccccggccgg gggagtaggt tgaagtctcc 60
taacgatgcc cggtgggctg cgggcaccgg gagctgtgaa gggaacgtga gggggcggcg 120
tagtggagac ccacggcagg cctgaagaag agcggcggcc gagcccgcct tccctgcacc 180
atgctcatag aggatgtgga tgccctcaag tcctggctgg ccaagttact ggagccgata 240
tgtgatgctg atccttcagc cttagccaac tatgttgtag cactggtcaa gaaggacaaa 300
cctgagaaag aattaaaagc cttttgtgct gatcaacttg atgtcttttt acaaaaagaa 360
acttcaggtt tcgtggacaa actatttgac agtctctata ctaagaacta ccttccactt 420
ttggaaccag taaaagcctg agccaaaacc actagttcca agaaaaagac gaaattaccg 480
aagaggtatt tcaggagcca gcagaggaag aacgagatgg cagaaaaaag aaatatccta 540
gtccccagaa gactcgttca caatctactg aacgaaggac acgtgagaac aaaagagacg 600
acgggacctt ggcgagacta tgaccggtac tatgagcgga atgaattgta ccgtgagaag 660
tatgactgga gaagaggcag gagtcagagt cggagtaaga gtcgaggcct gagtcgcagt 720
agaagccgaa gtagggggcg cacgcaaaga ccgggatcca aataggaatg ttgagcacag 780
ggaaagatcg aagtttaaga gtgacaggaa tgacctggag aagttcctat gtgcctgtgt 840
ctgcaccacc tccaaactct tctgagcagt attcctctgg ggcacagtct attcccagca 900
ctgttactgt gatcgcaccc tggtcaccca ctcttgaaaa cacaacttgg gagttggtct 960
tattactatt aaaaattatt agctcttcca attcttttgg tcgaaaccta ccaccaaaga 1020
ggcgatgcag agattatgat gaaagaggat tttgtgtact tggtgacctt tgtcagtttg 1080
atcatggaaa tgatccccta gttgttgatg aagttgctct gccaagtatg attcctttcc 1140
caccccctcc tcctgggctt cctcctccaa caactcctgg aatgttaatg cctccaatgc 1200
caggtccagg cccaggcccg ggcccaggtc caggcccagg cccgggccca ggtccaggtc 1260
ctggccatag tatgagactt cctgttcccc aaggacatgg tcagcctcca ccatccgttg 1320
tgcttcccat accaagacca cctataacac aatcaagctt gataaacagc cgtgaccagc 1380
ctgggacaag tgcagtgccc aatcttgcat cagtgggaac aagactacct cctcctttac 1440
cccagaacct cctttacaca gtatcagaac gacagcccat gtactctcgt gaacatggtg 1500
ctgctgcatc tgagcgactt cagttgggga caccgcctcc tctgttggca gctcgtttgg 1560
tgccacctcg aaacctcatg ggatcctcca ttggatacca tacctcagtc tcc 1613

<210> 40
<211> 1056
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: LI:023518.3:2000MAY01



<400> 40 

ccagaggaaa ctagtcacaa aaaccctgac tatcacctga tagattgctt gtgctgcctg 60
ataattactc gcacttttcc caggctagtg caaatcttca ggggccgtcc aggactacag 120
agctgtttca ccctaccttg gcttcaatct cttcccccat gctcgaaggt gcggagctgt 180
acttcaacgt ggaccatggc tacctggagg gcctggttcg aggatgcaag gccagcctcc 240
tgacccagca agactatatc aacctggtcc agtgtgagac cctagaagct ccattcttcc 300
aagactgcat gtctgaaaat gctctagatg aactgaatat tgaattgcta cgcaataaac 360
tatacaagtc ttaccttgag gcattctata aattctgtaa gaatcatggt gatgtcacag 420
cagaagttat gtgtcccatt cttgagtttg aggccgacag acgtgctttt atcatcactc 480
ttaactcctt tggcactgaa ttgagcaaag aagaccgaga gaccctctat ccaacctttc 540
ggcaactcta tcctgagggg ctgcggctgt tggctcaggc ggaagacttt gaccagatga 600
agaacgtagc ggatcattac ggagtataca aacctttatt tgaagctgta ggtggcagtg 660
ggggaaagac attggaggac gtgttttacg agcgtgaggt acaaatgaat gtgctggcat 720 
tcaacagaca gttccactac ggtgtgtttt atgcatatgt aaagctgaag gaacaggaaa 780 
ttagaaatat tgtgtggata gcagaatgta tttcacagag gcatcgaact aaaatcaaca 840 
gttacattcc aattttataa cccaagtaag gttctcaaat gtagaaaatt ataaatgtta 900 
aaaggaagtt attgaagaaa ataaaagaaa ttatgttata ttatctagac tacacataag 960 
taagccacac tatatcttca tgagttgcaa atccatggaa acacagtaaa ccaggcctga 1020 
aacaaagcat ttccttggtt tcagtggtat tagatc 1056 

<210> 41 
<211> 3806 
<212> DNA

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:1084246.1:2000MAY01
<400> 41 

cgttacaagc agtgcaggtt taccaacggc ttggggcagc gatatactaa acaaatttaa 60 
tttaaaagca actgtgtgac gattcctcca agcaagaaat tggaattgaa tgtctcaagt 120 
ctcgttgcgg ttgctgaggg gattggatat agggacctgg actccaacat gaagaagcta 180 
gggagaattc atccaaacag gcaagtgttg gcctttattt tgatggtgtt cttgtctcag 240 
gttcgcctcg agcctattcg ttattctgtg ttggaggaaa cagagagcgg ctcctttgta 300 
gcccatctgg ccaaggatct gggcctggga attggggaac tggcctcccg gtcagcccgg 360 
gtgctgtctg acgatgacaa gcagcgtttg cagctggatc gtcagactgg agatttgctt 420 
ctgagggaga aactagaccg ggaagagctc tgtggtccta ttgaaccgtg tgtactgcat 480 
ttccaagtgt tcctggaaat gccggtgcaa ttttttcata ggagaattat tgatccagga 540 
tcatatatgt atcactctcc aatattccct gaaagggaag tgctcttgaa aatactagaa 600 
aatagtccag ccgggtactc tatttccgtt gctaatagct gaggatttgg atgtgggcag 660 
caatggtctt caaataatac acaatcagcc ccaattctca ttttcacatt ctcactcgaa 720 
atcatagtga gggcaagaaa tacccagatt tggtgcagga caaaccacta gatcgagagg 780 
agtcagcctg agttacagct taaccctcgt ggcgctggat ggtgggtcac cacctaggtc 840 
tggcacggtc atggttcgaa tcctgatcat ggacatcaat gacaatgctc ctgagtttgt 900 
gcacactcca tatggggtgc aggtcctgga aaacagcccc ctagactctc caattgttag 960 
ggtcttagct agagatatag atgctggaaa cttcgggagt gtttcttatg gcttattcca 1020 
agcatcagat gaaattaaac aaactttctc aataaatgaa gtcacgggag aaatactgtt 1080 
gaaaaaaaaa ttggatttcg aaaaaattaa atcttaccat gtagaaattg aggccacaga 1140 
tggaggaggc ctttctggaa aaggcactgt agtcatagag gtggtggatg tgaatgacaa 1200 
tcccccagaa cttatcatat cttcactcac cagctccatc ccagaaaatg ctcctgagac 1260 
ggtagtctct atcttccgaa ttcgagatag agattccgga gaaaatggaa agatgatttg 1320 
ctctattcca gataatctac cgtttattct aaaaccaact ttgaagaatt tttacaccct 1380 
ggtaacgggg gtgaccactg gaccgagaga ccagcactga gtacaacatc accatcgccg 1440 
tcactgactt ggggacaccc aggctgaaaa cccagcagaa cataaccgtg caggtctccg 1500 
acgtcaatga caacgccccc gccttcaccc aaacctccta caccctgttc gtccgcgaga 1560 
acaacagccc cgccctgcac atcggcagtg tcagcgccac agacagagac tcgggcacca 1620 
acgcccaggt cacctactcg ctgctgccgc cccaggaccc gcacctgccc ctcgcctccc 1680 
tggtctccat caacgcagac aacggccacc tgttcgccct caggtcgctg gactacgagg 1740 
ccctgcaggc gttcgagttc cgcgtgggcg cctcagaccg cggttctccg gctttgagca 1800 
gcgaggcgct ggtgcgcgtg ctggtgctgg acaccaacga caactcgccc ttcgtgctgt 1860 
acccgctgca gaatggctcc gcgccctgca ccgagctggt gccccgggcg gccgagccgg 1920 
gctacctggt gaccaaggtg gtggcggtgg acggcgactc gggccagaac gcctggctgt 1980 
cgtaccagct gctcaaggcc acggagcctg ggctgttcgg cgtgtgggcg cacaatggcg 2040 
aggtgcgcac cgccaggctg ctgagcgagc gcgacgcagc caagcacagg ctcgtggtgc 2100 
ttgtcaagga caatggcgag cctccgcgct cggccaccgc cacgctgcac gtgctcctgg 2160 
tggatggctt ctcccagccc tacctgcctc tccctgaggc ggccccggcc caggcccagg 2220 
ccgactctct caccgtctac ctggtggtgg cgttggcctc ggtgtcgtcg ctcttcctct 2280 
tctcggtgct cctgttcgtg gcggtgcggc tgtgcaggag gagcagggcg gcctcggtgg 2340 
gtcgctgctc ggtgcccgag ggcccctttc cagggcatct ggtggacgta agcggcaccg 2400 
ggaccctgtc ccaagagcta ccagtacgag gtgtgtctga caggagactc tgggactggt 2460 
gagttcaagt tcctgaagcc aatatttcct aatctcttgg ttcaggacac cggggaggga 2520 
agttaaggaa aacccccaag ttcagaaata gcttggtatt cagttaagta ttgtatttag 2580 
ttcagtgaac cgcccgttaa gttttgtcaa acttcccact ggcaatgcct ttatttaaaa 2640 
aaattgtcta cttatctgaa atattcatac cacaatttca aacctactca tgtccctgat 2700 
aaagctaaat ttgtcccttt tttattgtta ttaattgcac ttaacatttt tagttatact 2760 
ggatattgag tatggatttt ctctatattt gatctattgg tgattaatct ttttgtaatc 2820 
ataaattact caattaggat aaaaataaat tatgttttaa tgaaattctt aaattaacat 2880 
ctttttaatg gaacatttaa gtgaatatat gaatattgaa tttctaaata tttgttgtgc 2940

ctgtctttac catgtaactt aatgtttgca aggccagagt gtttgaaagt tttgtattta 3000 

actttataat taccttgtcc tttctggttg actatactag gctaagccct cttaatagcc 3060 

atgagtgtaa aatttagttt actcattttt cacaaattgt aaattaacat ggcacttcac 3120 

tacattggta atacactaaa attgtggtcc ttttcctctt gtgaccacca catgtctagt 3180 

gattattttg tttatttggt tgctacttac ctagcacatt gtaatgttcc atgaatgcta 3240 

atattaaatt ttgtaaaaat aacttattta taaataattt ttaaagagaa aaatctcata 3300 

taatttgtca taacctttca ataaataaaa ctgttaaatc atgggcctga tatcatctta 3360

aaaaaaaatc ctcagaatct gaaataagcc ctaaatttct ccccaaaatc aagactcttg 3420 

agagcatcat aggtctcctt gtgctacctt ttactcccta taaatagaaa tccaagtata 3480 

ctttaatatg tgtatatttt ttggttttcc tacagcttct ccccatcttt caaaagaatc 3540 

acgaaatttc ttctgcacct tggctattct gtttaaatct gataatcagt tgatctcagg 3600

tttttcactg tacattactt tgcagatatg gacagccttt acaaaaataa tttttaaatg 3660 

cttaattatt ttaatttgtt ctttaaggta accttcagtt attttgtatt aatttaactt 3720 

ctcaattatg ccaaagttgc acttgcatga aataaatatt attttgtcct tgtatagact 3780 

ggaacagtaa taaatttatc tgaatt 3806 

<210> 42 

<211> 6230

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: LI:1165828.1:2000MAY01

<220> 

<221> unsure 

<222> 4042, 4725 

<223> a, t, c, g, or other 

<400> 42 

ctcgcttttc ttgcaatatt ttataccttt tcaattcata gaattactca agaaaactac 60 
ctcagttggt tgctactttt tgttgattcc ttttaccaga catgactaag tttctttttc 120 
atcagtagat ttctgggctc ctatattcac tagagattgc aactcctgga tttctcttac 180 
actagaatcc tatttcgagc catatgggag attctgaatt ccagaacaaa agaattttgt 240
aatttaaaat tcgtgattgc tcaatggaat cattttaatt gttacttcat ttctgtcgtt 300
atttaaaact taagtggaga gttttctcag ggataagaaa accacaatca aggtcataca 360
aaacttttag aggcagtcag tctgctaaga aggctccagc aagagaaacg ggatcttctg 420
tttcaacaat cattacttaa gaaaaaatta agaaaatgaa ataagttttg cagaataact 480 
gtgaaatttt tattcatgaa atatgtactt acactttggg ccacgtgatg tcactctttg 540 
ccgcgatgtt ctctctgaat ccagacaaat acagcccttt tcccatggga aagaggctca 600 
attctttttc actctctctg tgctgaacga tggcgaacac agcagaatgg gactgacgaa 660
atcagatgat ttcttctaat ttggaggcaa ttttcactaa ttagaagaag actgagtatt 720 
tgaaatgtta tactcaagtc gaggagatcc agagggtcag cctctactgc tctcgcttct 780 
gatcctcgca atgtgggtgg tggggagcgg ccagctccac tactccgtcc cggaggaagc 840
cgaacacggc accttcgtgg gccgcatcgc gcaggacctg gggctggagc tggcggagct 900
ggtgccgcgc ctgttccagt tggattccaa aggccgcggg gaccttctgg aggtaaatct 960
gcagaatggc attttgtttg tgaattctcg gatcgaccgc gaggagctgt gcgggcggag 1020
cgcggagtgc agcatccacc tggaggtgat cgtagacagg ccgctgcagg ttttccatgt 1080 
ggacgtggag gtgaaggaca ttaacgacaa ccctccagtg ttcccagcga cacaaaagaa 1140 
tctgttcatc gcggaatcca ggccgcttga ctctcggttt ccactagagg gcgcgtccga 1200
tgcagatatc ggggagaacg ccctgctcac ttacagactg agccccaatg agtatttctt 1260
cctggacgtg ccaaccagca accagcaggt aaaacctctt ggacttgtat tacggaaact 1320
tttagacaga gaagaaactc cggagcttca tttattgctc acggccaccg atggaggcaa 1380
acccgagctg actggcaccg ttcaattact catcacggta ctggacaaca atgacaatgc 1440
cccagtgttc gacagaaccc tgtatacggt gaaattacca gaaaacgttt ctatcggaac 1500
gctggtgatt caccccaatg cctcagattt agacgaaggc ttgaatgggg atattattta 1560 
ctccttctcc agtgatgttt ctccagatat aaaatccaag ttccacatgg accccttaag 1620 
tggggcaatc acagtgatag gacatatgga ttttgaagaa agtagagcac acaagatccc 1680 
agtcgaggct gtcgataaag gcttcccacc cctggctggt cattgtacac ttcttgtgga 1740 
agttgtggat gtaaatgaca atgctccaca gttgactatc aaaacgctct cggttcctgt 1800
aaaagaggac gcacaactgg ggacagttat tgccctgatt agtgtgatcg acctagacgc 1860
agatgccaac gggcaggtga cctgctccct gacgccccac gtccccttca agctggtgtc 1920
cacctacaag aattactact cgttggtgct ggacagagct ctggaccgcg agagtgtgtc 1980
cgcctacgag ctggtggtta ccgcgcggga cgggggctcg ccttcactgt gggccacggc 2040
cagggtgtct gtggaggtgg ccgacgtgaa cgacaacgca ccagcgttcg cgcagtccga 2100 
gtacacggtg ttcgtgaagg agaacaaccc gccgggctgc cacatcttca cggtgtctgc 2160
gcgggacgct gacgcgcagg agaacgccct ggtgtcctac tcgctggtgg agcggcggtt 2220
gggcgagcgc tcgctgtcga gctacgtgtc agtgcacgcg gagagcggca aggtgtacgc 2280
gctgcagccg ttggaccacg aggagctgga gctgctacag ttccaggtga gcgcgcgcga 2340
cgcgggcgtg ccgcctctgg gcagcaacgt gacgctgcag gtgttcgtgc tggacgagaa 2400
cgacaatgcg ccggcgctgc tgacacctcg gatgaggggc actgacggcg cagtgagcga 2460
gatggtgctg cggtcggtgg gcgccggcgt agtggtgggg aaggtgcgcg cagtggacgc 2520
cgactcgggc tacaacgcgt ggctttcata cgagctgcag ccagaaacgg ccagcgcgag 2580
catcccgttc cgcgtggggc tgtacacggg cgagatcagc acaacgcgtg ccctggacga 2640
aacggacgca ccgcgccagc gcctactggt gctggtgaaa gaccacgggg agccagcgct 2700
gacggccacg gccactgtgc tggtgtcgct ggtggagagc ggccaggcgc caaagtcatc 2760
gtcgcgggcg tcagtgggtg ccacgggccc cgaggtgacg ctggtggatg tcaacgtgta 2820
cctgatcatc gccatctgcg cggtgtctag cctgttggtt ctcacgctgc tgctgtacac 2880
tgtgctgcgg tgctcggcga tgcccaccga gggcgagtgc gcgcctggca aggccgacgc 2940
tggtgtgttc tagcgcggtg gggagttggt cgtactcgca gcagagggag gcagagggtg 3000
tgctctggcg agggtaagca gaagaccgac ctcatggcct tcagcccggg cctttcttcc 3060
ttgtgctggg atctacagag cgaacgggag aaccctctgc ttcctcagat tcaactggga 3120
agccacgaca gcccaaccct gactggcgtt actctgcctc cctgagagca ggcatgcaca 3180
gctctgtgca cctagaggag gctggcattc tacgggctgg tccaggaggg gcctgatcag 3240
cagtggccaa ccagtatcca gtgcaacacc cagaacccag aggcaggaga agtgtcccct 3300
cccagtcggt gcgggtgtca acagcaacag cgtggacctt taaatacgga ccaggcaacc 3360
ccaaacaatc cgagtcccgg tgagttgccc gacaaattca ttatcccagg atctcctggc 3420
aatcatctcc atccggcagg agcctactta cagccacaat tgacaaaagt gacttcataa 3480
cccttcggca aaaaggagga gacccagaaa aagaagaaaa agaagaaggg ttaccagacc 3540
caggagaaaa aagagaaagg gaacagcacg acgtgacaac cagtgaccac gtgaggtcct 3600
caaatgggaa acaagccact tagccagttt tttgtaataa tgggcaaatc tctcccatgt 3660
aggcaattgc cctgctcctt gtttcctatc tacattgagc cctcttagag acccgtcaga 3720
taatctgcag ataagttccc tggtgtctgt gctagaacgg catttaacac gtttttgtcg 3780
taaaaagctt tactaagtct ggttgttaac tctttctctc cactctggct gtgttttcag 3840
aacctataaa gagcagaccc agagtgtgtc ctgttgctcc tccggccgca ataggagagg 3900
cttcccagcc ccgccagtga gaggtgtgga ctctctgccc tgtgctccgg ggatcctgtc 3960
ttcgatgaca cttgcatggg caggctgaaa agttttgaga ttgagcagct tgggagtttg 4020
tggcccaccc tggggggtta anttgttgct tttgggctaa ccccggcggg ggtaattgcc 4080
gagtgccaga tattggctga gaccgagcca gcttagacta attgggtaca agggaaaggc 4140
aagataacac aacgacaaat aaaacagcgg aagttatcag tatggagggg aaagtgtaaa 4200
ctctaaaggc gaccagacct ttcatagaat ccttacaact caagaggtgg cagccacctc 4260
tcttaggaga caaaacgtac tcgcccacca acaagactat taggagacca ctaaaatctg 4320
ttggctagtg acgtcattat acctaaaatc tggcattcat tacctggcaa ggccaaacag 4380
ttcaggtgtt taaacagaga atacaccgct gggaaacaga agcagatctg atgtgattcg 4440
ctatacatgt gcatgtgctc actttattaa aaattctttt gcacacaatt gtttatggaa 4500
aagggccaga tcctttttcc aatacttatg gcaaaagcaa aagaaaaccc cggacacctt 4560
cacctttccg ctgtttgttg tttcactagg atttatttaa aaaaagagaa agtctatagc 4620
tataaagtct ttaaagagaa atatgaatac aattccccta aactctgcct caaaagagaa 4680
ttcaggtcta caacgggcgc agtttaaaat ttggactcac ttggnctgct acacgaagtg 4740
ctcttataga gaattgcctg aaacatctgt attatatcgg ccaccctgcc caatcacagc 4800
tttactcttt caggtcatct ctggggctgc cctcttgaca tgtattacta aataaaatga 4860
tctctctttc tctcgtctcg tctctctttt ctaagaaacc aattatgtgc acctttgata 4920
ccaccaaccc ttctctaacc caacctatat atccagaccc caaaaattga agaaaaatat 4980
tggttgttct catacaggtg gagcagattt ctgcaatcta cttaattctg gtggacttgg 5040
tctgggtggt gctagccata caccttcgtc gtttggttta gttttccctt tctaaaacca 5100
cctcctgaat tgtctaattc ttaactaacc accctatgaa tgttaccccg agaatcccat 5160
ctcccacata tgtatggctg ttatggctat gcttagactc cctggaataa taacttactt 5220
ctcgtgcttg tgtaatagtg aaaggtaata gccactatta cctcagagtg aactttaagc 5280
tttattgttg aaagtgaata tcccttataa tattcccttt gtgacaacct cgtggaaaaa 5340
atggagtgag tggttttttt aacccttggt aatacagact tttgtgtatg aaagacccca 5400
gtaaaatttc ttttttaaat ccagatactg gtgattcaag gaattttatt tatggtccag 5460
cccagagcca tctcgtgccc agacttctgc tggcaagggg agtggataaa gctgttttgg 5520
ttcttagtaa caattttgga atgaatactg acaatattcc atgaagggtg tgcaagcaca 5580
aattttacca atctgacctc ttgtgaagtt gcagtaatgc tttgaaattt ctaatgggta 5640
tcctgaaata tcagctcata ggaaagtacc aaaatttgct gtcaccttaa ataagacatt 5700
ttaattttgg ttataatgta caatttagaa agtttgatta attatattat ctatttaggc 5760
attaatataa aagaggtagg agtctgttat ttaaaaaaag ccatttaatt taaaaaaaaa 5820
ctgtctgtgt ctacttttag cttcattctc ccatattttg gaagggtgtg taaactttca 5880
agctctgcag gattgccatg gggtaaaact tgttacccaa cacatgtgaa ccatttgcta 5940
cattgtaggt tgtgatcatt ttggccccac tgaagcccca tgtatcctga cccttaacgt 6000
gcccttttga actaggagaa tcgggctaat ttattaatga tgataattat aatgtatctg 6060
tacagcactt tttacatttg cgaagtgcct ttccaatcca tgttagttac tagttattac 6120
cagctgtaaa ggagttaaac acctcaagtg gaatcatttt gaaattggtg ctaattggta 6180 
tttcctcctg ttatctgcta ataaatgaaa aatggtggta tgaaaaaaaa 6230 

<210> 43 

<211> 2940 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:007302.1:2000MAY01 
<400> 43 

aagaatttgg actcatatca agatgctctg aagaagaaca accctttagg atagccactg 60 
caacatcatg accaaagaca aagaacctat tgttaaaagc ttccattttg tttgccttat 120 
gatcataata gttggaacca gaatccagtt ctccgacgga aatgaatttg cagtagacaa 180 
gtcaaaaaga ggtcttattc atgttccaaa agacctaccg ctgaaaacca aagtcttaga 240 
tatgtctcag aactacatcg ctgagcttca ggtctctgac atgagctttc tatcagagtt 300 
gacagttttg agactttccc ataacagaat ccagctactt gatttaagtg ttttcaagtt 360 
caaccaggat ttagaatatt tggatttatc tcataatcag ttgcaaaaga tatcctgcca 420 
tcctattgtg agtttcaggc atttagatct ctcattcaat gatttcaagg ccctgcccat 480 
ctgtaaggaa tttggcaact tatcacaact gaatttcttg ggattgagtg ctatgaagct 540 
gcaaaaatta gatttgctgc caattgctca cttgcatcta agttatatcc ttctggattt 600 
aagaaattat tatataaaag aaaatgagac agaaagtcta caaattctga atgcaaaaac 660 
ccttcacctt gtttttcacc caactagttt attcgctatc caagtgaaca tatcagttaa 720 
tactttaggg tgcttacaac tgactaatat taaattgaat gatgacaact gtcaagtttt 780 
cattaaattt ttatcagaac tcaccagagg tccaacctta ctgaatttta ccctcaacca 840 
catagaaacg acttggaaat gcctggtcag agtctttcaa tttctttggc ccaaacctgt 900 
ggaatatctc aatatttaca atttaacaat aattgaaagc attcgtgaag aagattttac 960 
ttattctaaa acgacattga aagcattgac aatagaacat atcacgaacc aagtttttct 1020 
gttttcacag acagctttgt acaccgtgtt ttctgagatg aacattatga tgttaaccat 1080 
ttcagataca ccttttatac acatgctgtg tcctcatgca ccaagcacat tcaagttttt 1140 
gaactttacc cagaacgttt tcacagatag tatttttgaa aaatgttcca cgttagttaa 1200 
attggagaca cttatcttac aaaagaatgg attaaaagac cttttcaaag taggtctcat 1260 
gacgaaggat atgccttctt tggaaatact ggatgttagc tggaattctt tggaatctgg 1320 
tagacataaa gaaaactgca cttgggttga gagtatagtg gtgttaaatt tgtcttcaaa 1380 
tatgcttact gactctgttt tcagatgttt acctcccagg atcaaggtac ttgatcttca 1440 
cagcaataaa ataaagagcg ttcctaaaca agtcgtaaaa ctggaagctt tgcaagaact 1500 
caatgttgct ttcaattctt taactgacct tcctggatgt ggcagcttta gcagcctttc 1560 
tgtattgatc attgatcaca attcagtttc ccacccatcg gctgatttct tccagagctg 1620 
ccagaagafcg aggtcaataa aagcagggga caatccattc caatgtacct gtgagctaag 1680 
agaatttgtc aaaaatatag accaagtatc aagtgaagtg ttagagggct ggcctgattc 1740 
ttataagtgt gactacccag aaagttatag aggaagccca ctaaaggact ttcacatgtc 1800 
tgaattatcc tgcaacataa ctctgctgat cgtcaccatc ggtgccacca tgctggtgtt 1860 
ggctgtgact gtgacctccc tctgcatcta cttggatctg ccctggtatc tcaggatggt 1920 
gtgccagtgg acccagactc ggcgcagggc caggaacata cccttagaag aactccaaag 1980 
aaacctccag tttcatgctt ttatttcata tagtgaacat gattctgcct gggtgaaaag 2040 
tgaattggta ccttacctag aaaaagaaga tatacagatt tgtcttcatg agaggaactt 2100 
tgtccctggc aagagcattg tggaaaatat catcaactgc attgagaaga gttacaagtc 2160 
catctttgtt ttgtctccca actttgtcca gagtgagtgg tgccattacg aactctattt 2220 
tgcccatcac aatctctttc atgaaggatc taataactta atcctcatct tactggaacc 2280 
cattccacag aacagcattc ccaacaagta ccacaagctg aaggctctca tgacgcagcg 2340 
gacttatttg cagtggccca aggagaaaag caaacgtggg gctcttttgg gctaacatta 2400 
gagccgcttt taatatgaaa ttaacactag tcactgaaaa caatgatgtg aaatcttaaa 2460 
aaaatttagg aaattcaact taagaaacca ttatttactt ggatgatggt gaatagtaca 2520 
gtcgtaagta actgtctgga ggtgcctcca ttatcctcat gccttcagga aagacttaac 2580 
aaaaacaatg tttcatctgg ggaactgagc taggcggtga ggttagcctg ccagttagag 2640 
acagcccagt ctcttctggt ttaatcatta tgtttcaaat tggaaacagt ctcttttgag 2700 
taaatgctca gtttttcagc tcctctccac tctgctttcc caaatggatt ctgttgtgag 2760 
caagagttta tatggcttca tggcagcaag ggaacagtca acttcagcat catatgcacc 2820 
agtcctcgga gtgccctgtg aatcatattg gtctttgggt cagtgtcatc attctcttca 2880 
agtctggggc ttggggaaaa aattagatca gctacggcat ataaaaaagt cttttgtttc 2940 

<210> 44 

<211> 4438 

<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:236386.4:2000MAY01 
<220> 

<221> unsure 
<222> 3774 

<223> a, t, c, g, or other 
<400> 44 

taagcctcag tccttgtttt cccggcctgg ctcgttgtga agccggacac atccaccctt 60 
ggactcgatt caggcggctg ctgcttttct ccttgcccct cttggatttt ccggattttt 120 
gaaaacccag tggcccagga gcaagaggag gaaggaggaa ggggcagatc tgcagaggaa 180 
tgtgagagcc tcccaaagcg agagccgcca aaagaatctg ggagccagag ggacatccga 240 
gccctgcccg ggtttctgga atggtggttt cagagtgagt ctcttctatt ttagaacgtt 300 
gttccagtgg aaagtgtcga atttttcccc tcgcagggca gatttctcca ggtcacttga 360 
cttttcttct gggagtagga gttaggagag attcccctct aaccccccag aggctgctaa 420 
gggaggagga gactgtggac atgagccctc cctgctcaca agcatatgcc cggagacctg 480 
atagggcagt ttctgggcca tggacattgc tttgaagagg gggagactgg acagcatctg 540 
tgggtgctga gaccccacct taggacctga gagattgaac tgtgtaagcg ccattcagct 600 
gcgagtgcat tcttggactg ccttgtgagc atccccggtc tgggcaggac cctctccttc 660 
ccatctttct ataccaccca gcccagccat ggcactgaaa ggccgagccc tctatgactt 720 
tcacagtgag aacaaggagg aaatcagcat ccagcaggat gaggacctgg tcatctttaa 780 
cgagaactca cttggattgg ttggcttgca gggccaaaac agccgtgggg agacagggct 840 
ctttcctgcc tcttatgtgg agatcgtccg ttctggcatc agcaccaacc atgctgacta 900 
ctccagcagc cctgcaggct ctcccggagc ccaggtgagc ttgtacaaca gccccagtgt 960 
ggccagccca gctaggagtg gtgggggcag tggcttcctc tcaaaaccag ggtagctttg 1020 
aggaggatga tgatgatgac tgggatgact ggtgacgacg gatgcacagt ggtggaggag 1080 
ccacagggct ggtggggctg gcgcacacaa cggggcaacc cgtcccctca accgtgtcct 1140 
agcatggggc cctaccccca gcccagcaca atgcccttcc ggcccaagcc aacaatgtga 1200 
ggcggcagga cagcctggca tctgccaagg cgaggcagtg tggtgggcca gtaacactca 1260 
accgtttctc atgctttgtg cgttctggaa tggaagccct taatcctggg tgatgtgccc 1320 
atgatggcac aagatcgctg agacatactc cattgaaatg ggccctcgtg gcccccagtg 1380 
tgaaggcgca atccccaccc atttgcctgc tctgtggagg accccacaaa acagaccaaa 1440 
ttcaagggca tcaaaagcta catctcctac aagctcacac ccacccatgc tgcctcaccc 1500 
gtctaccggc gctacaaaca ctttgactgg ctctataacc cgcctgctac acaagttcac 1560 
tgtcatctcg gtgccccacc tgcctgagaa gcaggccact ggccgcttcg aggaggactt 1620 
catcgaaaag cggaagcgga gactcatcct ctggatggac cacatgacca gccaccctgt 1680 
gctctcccag tacgaaggct tccagcattt cctcagctgc ctggatgaca agcagtggaa 1740 
gatgggcaaa cgccgggcgg agaaggatga gatggtgggt gccagcttcc tgctcacctt 1800 
ccagatcccc accgagcacc aggacttgca ggacgtggaa gatcgcgtgg acactttcaa 1860 
ggccttcagt aagaagatgg acgacagcgt cctgcagctc agcactgtgg catcagagct 1920 
ggtgcgtaaa catgtggggg gcttcccgca aggaattcca gaacgctggg cagtgccttc 1980 
caggccatca gtcattcctt ccagatggac cccccctttt gctctgaggc cctcaacagt 2040 
gccatttctc acacgggccg tacctatgaa gccatcgggg agatgtttgc tgagcagccc 2100 
aagaatgacc tcttccagat gctggacaca ctgtctctct accagggcct gctctccaac 2160 
ttccctgaca tcatccatct acaaaaaggc gccttcgcca aggtgaagga gagccaacgc 2220 
atgagtgacg agggccgcat ggtgcaggac gaggcagacg gcattcgcag gcgctgccgc 2280 
gtggtgggtt tcgccctgca ggccgagatg aaccacttcc accagcgccg tgagctcgac 2340 
ttcaagcatc atgatgcaga actacttgcg ccagcagatc ctcttctacc agcgggtggg 2400 
ccagcagctg cataagaccc tgcgcatgta tcacaccctc tgaccgcgtg tgcctgggct 2460 
ccctccttca cctgggcctg gtcactgcag tgtactccac tttcacgacc accctatgcc 2520 
agcagtgact gatgaattgg tcagcggtgg cggagataac cggcctgtcc tgcctcctgg 2580 
tagaaggagc tttcaaggag tcatgggtgc ccctgggaaa ttccccactc cttagaagtg 2640 
gggcacagca ggggtgagaa tagagtcagg agccctcgag gccaaggcct gggctgccgg 2700 
tcagtccagt gaaggtcagg ccagggtctc agcctcccct agagcctatt ttgcttgctc 2760 
acctggcgca ctgtgtgcct tatccattca gcagacaccg aggcctgctg cacccttggg 2820 
tcggatgctg ggcaccccag ggctgtgaca tgcctgcctc ttcaggagtc ctcaagtgaa 2880 
ggtcggggtc agacacagac agagtcaact gcagtactga ctgactgctt taaatgacgg 2940 
gatttttgga agctctatag aagggaccac agctattcca ctggtcaggg tagactccat 3000 
agagtaggct acatttgggg cagtgttttg aagaatctag caaggaccta ggcccagaca 3060 
gtacatgcgg gacgaagaga cttctaccgg gagaggaaca gcatgaggcc aaagttatgg 3120 
agggcttgca aacttctccc tcttctctcc ccttactttc caaggcaagt taggtgacgc 3180 
tttccatggg gattctcggc ctgtgtggta aggaacgagg atctcccttg ctccccatgt 3240 
agctggtctg tccgtgacat caccctgtcc cctgcaggag ggggctacag gccatctccc 3300 
ttcctgtagg cctctgactc ccctcccact tttggggccc tcagcttatc tcgcgcatgg 3360 
ggaccattcg cagcatcctc gccctcctgc ggactcaaga tccatgagat ataagccctg 3420 
ggccccagat ccctggtgac accttccttg gagaagactc tcaaaagtga ctgtatattt 3480 
gagttcacca gcaataactc cccacacttc gaagcatggt ccaaacccat ggatcccagg 3540 
gtccttgggc ctctgtgggc actgtcttcc caagatcctt cctgttgcaa caatgggaaa 3600 
ccttaagagg aaaaagacag gggcctgctt tgcccagccc atgcgaaggg attccatgcc 3660 
cacctgccct ctgcctgcct cgctggaatg tgggcccctg ctcccccgtc agggtggtgc 3720 
tgtctctgac ctatgtttac gatccccgag gggtttttgg cttccccttc ccanccaggt 3780 
cagggtgtgg ttccagcagc ttgctgtggg gtgctgacat gtgtcaccac tgcccccctt 3840 
gtcccccggg ggggtcatgg tctcctcctg gatgctgctc cttgaatctt ttttcttgat 3900 
aaacctttta caattaagat aacacaagca tgactttttc tgtttggatc ccagaaaggc 3960 
ggagggcagg agaaggatag agccctaatt gctcctgaga gccattggat gagattctga 4020 
ggtcgtggtg ggcacaaatt ttccacagaa cctcaaaagt tcaggggagg gctatgctgg 4080 
tggaaggtgc cagcaggcag gaggagctag aggcggctgt ggacccctgg gtggatccat 4140 
ccctccctag aacgcactct tgtctctaaa acaggtggag tgctgcccag gggactggct 4200 
gtactgcctt gtgatctggg gctgagggtt gtatgaggaa gggacaggac gctgtgccct 4260 
aggacaatta atagatggtg gctcctctcc ccaaggagcc atgccctggc cttgcccttg 4320 
aaaagcccta gtccagggga gggaagtggg ggactcagaa gctgtgtctc ttccccaaac 4380 
cgtcctgggt acccagccct gcggaggtcc cacattggaa ctgaagagga cgctggct 4438 

<210> 45 

<211> 987 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:252904.5:2000MAY01 
<400> 45 

cccgacttca gccccagcca gatcccgcgt caacggaggc ggaacggcgg accccgtacc 60 

ctggcagcat cggagcaccg gcgggtgaag gcaaggtccc tggactggtc atatacctct 120 

tgtggccctg gcagaatcaa gatgaggccc tgtcatgcct ccccagtgag gcctacagtc 180 

tgagcagaca gcatggcctg ccactggcag tgaacaccat gtctgcagga ggtggccggg 240 

cctttgcttg atggtatggt gtatgctctg gggggaatgg gccctgacac ggccccccag 300 

gcccaggtac gtgtgtatga gccccgtcgg gactgctggc tttcgctacc ctccatgccc 360 

acaccctgct atggggcctc caccttcctg cacgggaaca agatctatgt cctggggggc 420 

cgccagggca agctcccggt gactgctttt gaagcctttg atctggaggc ccgtacatgg 480 

acccggcatc caagcctacc cagccgtcgg gcctttgctg gctgcgccat ggctgaaggc 540 

agcgtcttta gcctgggtgg cctgcagcag cctgggcccc acaacttcta ctctcgccca 600 

cactttgtca acactgtgga gatgtttgac ctggagcatg ggtcctggac caaattgccc 660 

cgcagcctgc gcatgaggga taagagggca gactttgtgg ttgggtccct tgggggccac 720 

attgtggcca ttgggggcct tggaaaccag ccatgtcctt tgggctctgt ggagagcttt 780 

agccttgcac ggcggcgctg ggaggcattg cctgccatgc ccactgcccg ctgctcctgc 840 

tctagtctgc aggctgggcc ccggctgttt gttattgggg gtgtggccca gggccccagt 900 

caagccgtgg aggcactgtg tctgcgtgat ggggtctgaa ggcttggtgg agctgtccac 960 

tgagcagctc attggggatc cactagt 987 

<210> 46 
<211> 263 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:977683.1.orf3:2000FEB18 
<400> 46 

Gly Ser Asp Met Ala Ala Asp Leu Asn Leu Glu Trp Ile Ser Leu 
1 5 10 15 

Pro Arg Ser Trp Thr Tyr Gly Ile Thr Arg Gly Gly Arg Val Phe 
20 25 30 

Phe Ile Asn Glu Glu Ala Lys Ser Thr Thr Trp Leu His Pro Val 
35 40 45 

Thr Gly Glu Ala Val Val Thr Gly His Arg Arg Gln Ser Thr Asp 
50 55 60 

Leu Pro Thr Gly Trp Glu Glu Ala Tyr Thr Phe Glu Gly Ala Arg 
65 70 75 

Tyr Tyr Ile Asn His Asn Glu Arg Lys Val Thr Cys Lys His Pro 
80 85 90 

Val Thr Gly Gln Pro Ser Gln Asp Asn Cys Ile Phe Val Val Asn 
95 100 105 

Glu Gln Thr Val Ala Thr Met Thr Ser Glu Glu Lys Lys Glu Arg 
110 115 120 

Pro Tl ^. Ser Met Ile Asn Glu Ala Ser Asn Tyr Asn Val Thr Ser 
125 130 135 

Asp Tyr Ala Val His Pro Met Ser Pro Val Gly Arg Thr Ser Arg 
140 145 150 

Ala Ser Lys Lys Val His Asn Phe Gly Lys Arg Ser Asn Ser Ile 
155 160 165 

Lys Arg Asn Pro Asn Ala Pro Val Val Arg Arg Gly Trp Leu Tyr 
170 175 180 

Lys Gln Asp Ser Thr Gly Met Lys Leu Trp Lys Lys Arg Trp Phe 
185 190 195 

Val Leu Ser Asp Leu Cys Leu Phe Tyr Tyr Arg Asp Glu Lys Glu 
200 205 210 

Glu Gly Ile Leu Gly Ser Ile Leu Leu Pro Ser Phe Gln Ile Ser 
215 220 225 

Phe Ala Tyr Pro Leu Lys Ile Thr Leu Ile Ala Asn Met Leu Leu 
230 235 240 

Arg Gln Pro Ile Gln Thr Cys Gly Pro Ile Ile Ser Ala Leu Ile 
245 250 255 

Gln Glu Arg Lys Trp Ser Cys Gly 
260 

<210> 47 

<211> 217 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:893050.1.orf1:2000FEB18 
<400> 47 



Ser Leu Pro Ser Thr Ser Phe Arg Val Ser Ser Leu Phe Ser Gly 
1 5 10 15 

His Leu Glu Val Leu Lys Leu Leu Val Ala Arg Gly Ala Asp Leu 
20 25 30 

Gly Cys Lys Ala Arg Lys Gly Tyr Gly Leu Leu His Thr Ala Ala 
35 40 45 

Ala Ser Gly Gln Ile Glu Val Val Lys Tyr Leu Leu Arg Met Gly 
50 55 60 

Ala Glu Ile Asp Glu Pro Asn Ala Phe Gly Asn Thr Ala Leu His 
65 70 75 

Ile Ala Cys Tyr Leu Gly Gln Asp Ala Val Ala Ile Glu Leu Val 
80 85 90 

Asn Ala Gly Ala Asn Val Asn Gln Pro Asn Asp Lys Gly Phe Thr 
95 100 105 

Pro Leu His Val Ala Ala Val Ser Thr Asn Gly Ala Leu Cys Leu 
110 115 120 

Glu Leu Leu Val Asn Asn Gly Ala Asp Val Asn Tyr Gln Ser Lys 
125 130 135 

Glu Gly Lys Ser Pro Leu His Met Ala Ala Ile His Gly Arg Phe 
140 145 150 

Thr Arg Ser Gln Ile Leu Ile Gln Asn Gly Ser Glu Ile Asp Cys 
155 160 165 

Ala Asp Lys Phe Gly Asn Thr Pro Leu His Val Ala Ala Arg Tyr 
170 175 180 

Gly His Glu Leu Leu Ile Ser Thr Leu Met Thr Asn Gly Ala Asp 
185 190 195 

Thr Gly Arg Arg Gly Ile His Asp Met Phe Pro Leu His Leu Ala 
200 205 210 

Val Leu Phe Gly Phe Ser Asp 
215 

<210> 48 
<211> 716 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:980153.1 

<220> 

<221> unsure 
<222> 683 

<223> unknown or other 
<400> 48 



Gln Arg Gly Ala Lys Thr Arg Leu Arg Pro Phe Ser Pro Arg His 
1 5 10 15 

Cys Tyr Lys Ala Ala Thr Ile Lys Asp Val Phe Gly Arg Asn Ala 
20 25 30 

Leu His Pro Cys Phe Leu Leu Val Glu Lys Lys Gly Val Leu Asp 
35 40 45 

Trp Leu Ile Gln Lys Gly Val Asp Leu Leu Val Lys Asp Lys Glu 
50 55 60 

Ser Gly Trp Thr Ala Leu His Arg Ser Ile Phe Tyr Gly His Ile 
65 70 75 

Asp Cys Val Trp Ser Leu Leu Lys His Gly Val Ser Leu Tyr Ile 
80 85 90 

Gln Asp Lys Glu Gly Leu Ser Ala Leu Asp Leu Val Met Lys Asp 
95 100 105 

Arg Pro Thr His Val Val Phe Lys Asn Thr Asp Pro Thr Asp Val 
110 115 120 

Tyr Thr Trp Gly Asp Asn Thr Asn Phe Thr Leu Gly His Gly Ser 
125 130 135 

Gln Asn Ser Lys His His Pro Glu Leu Val Asp Leu Phe Ser Arg 
140 145 150 

Ser Gly Ile Tyr Ile Lys Gln Val Val Leu Cys Lys Phe His Ser 
155 160 165 

Val Phe Leu Ser Gln Lys Gly Gln Val Tyr Thr Cys Gly His Gly 
170 175 180 

Pro Gly Gly Arg Leu Gly His Gly Asp Glu Gln Thr Cys Leu Val 
185 190 195 

Pro Arg Leu Val Glu Gly Leu Asn Gly His Asn Cys Ser Gln Val 
200 205 210 

Ala Ala Ma Lys Asp His Thr Val Val Leu Thr Glu Asp Gly Cys 
215 220 225 

Val Tyr Thr Phe Gly Leu Asn Ile Phe His Gln Leu Gly Ile Ile 
230 235 240 

Pro Pro Pro Ser Ser Cys Asn Val Pro Arg Gln Ile Gln Ala Lys 
245 250 255 

Tyr Leu Lys Gly Arg Thr Ile Ile Gly Val Ala Ala Gly Arg Phe 
260 265 270 

His Thr Val Leu Trp Thr Arg Glu Ala Val Tyr Thr Met Gly Leu 
275 280 285 

Asn Gly Gly Gln Leu Gly Cys Leu Leu Asp Pro Asn Gly Glu Lys 
290 295 300 

Cys Val Thr Ala Pro Arg Gln Val Ser Ala Leu His His Lys Asp 
305 310 315 

Ile Ala Leu Ser Leu Val Ala Ala Ser Asp Gly Ala Thr Val Cys 
320 325 330 

Val Thr Thr Arg Gly Asp Ile Tyr Leu Leu Ala Asp Tyr Gln Cys 
335 340 345 

Lys Lys Met Ala Ser Lys Gln Leu Asn Leu Lys Lys Val Leu Val 
350 355 360 

Ser Gly Gly His Met Glu Tyr Lys Val Asp Pro Glu His Leu Lys 
365 370 375 

Glu Asn Gly Gly Gln Lys Ile Cys Ile Leu Ala Met Asp Gly Ala 
380 385 390 

biy ax y 


Val 


Phe 


^.yfa 


irp 




Car- TTsal 

oer Val 


Asn 


ber 


Ser Leu Lys 


Gin 








J y D 








/ fin 




405 


tys AX y 


Trp Ala 


lyr 


xriO 




bin Val 


Fne 


lie 


Ser Asp lie 


Ala 








*± X VJ 












420 


T ■ en i H en 


Arg Asn 


bill 


lie 


ijeu 


T>1-»£a Trial 

riie vai 


Thr 


Gin 


Asp Gly Glu 


Gly 






















435 


Jriltr AX y 


Gly Arg 


irp 


rT162 


vslU 


Glu Lys 


Arg 


Lys 


Ser Ser Glu 


Lys 








440 








4 AR 




a c n 


4J o uxu 


lie 


Leu 




Aoli 


— 

lieu 


ill 5 ASU 


oer 


Ser 


ber Asp vai 


Ser 








455 








4 fin 

ft D \J 








Ser 


Asp 


X X t= 


noli 


o tsl 


val lyr 


blU 


Arg 


lie Arg jjeu 


G1U 








470 








475 






4t o U 


T,va TiCan 


Thr 


Phe 


Aid 


nib 


21 vrr 


nla Val 


Ser 


vai 


ber inr Asp 


Fro 








4.AR 














wCl V34.J/ 


Cys 


Asn 




nla 


i ie 


Lieu bin 


Ser 


Asp 


Pro Lys Tnr 


Ser 








JUU 












blO 


TtOll TKr-v* 
UcU ± J/ X 


Glu 


lie 


Pro 


nla 


val 


ber ber 


Ser 


Ser 


Fne Fne Glu 


Glu 








jij 








con 








irllfcr bl_y 


Lys 


Leu 


Leu 


Arg 


bill 


_ 

Ala Asp 


G1U 


Met 


Asp Ser lie 


His 








JJ u 








a c 

jjD 






540 


ion V»1 


Thr 


Phe 


vjin 


val 


pi,/ 

biy 


Asn Arg 


Leu 


Fne 


Pro Ala His 


Lys 








54.5 

J4 J 








Dj U 






ODD 


Ajfl lie 


Leu 


Ala 


Val 


rixs 


ser 


Asp Fne 


Fne 


bin 


Lys Leu Phe 


Leu 








jOU 








OOO 




D fQ 


Cav A on 
del Ao£> 


Gly Asn 


inr 


Ser 


blU 


Fne inr 


Asp 


lie 


Tyr Gin Lys 


Asp 








D / D 








con 
DO\) 




roc 

585 


olU AS£J 


Ser 


Ala 


fT \r 

biy 


Cys 


III c 

rllS 


Leu Phe 


Val 


Val 


Glu Lys Val 


His 








con 












600 




Met 


Phe 


blU 


Tyr 


Lieu 


Leu Gin 


Prie 


lie 


Tyr Thr Asp 


Thr 
















610 




615 


v-yb Asp 


Phe 


Leu 


Thr 


HIS 


biy 


Phe Lys 


Pro 


Arg 


lie His Leu 


Asn 








con 














63 0 


xjy o /ibii 


Pro 


Glu 


bill 


Tyr 


bin 


biy inr 


Leu 


Asn 


Ser His Leu 


Asn 








fi"3 c. 








o<tU 






c a cr 

645 


T.ye Va 1 

uy& val 


Asn 


Phe 


nib 


blU 


Asp 


Asp Asn 


bin 


Liys 


Ser Ala Phe 


Glu 








650 








655 




660 

u u u 


Val Tyr Lys Ser Asn Gln Ala Gln Thr Val Ser Glu Arg Gln Lys 
665 670 675 

Ser Lys Pro Lys Ser Cys Lys Xaa Gly Lys Asn Ile Arg Glu Asp 
680 685 690 

Asp Pro Val Arg Met Leu Gln Thr Val Ala Lys Lys Phe Asp Phe 
695 700 705 

Ser Asn Leu Ser Ser Arg Leu Asp Gly Val Arg 
710 715 

<210> 49 
<211> 107 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:350398.1.orf3:2000FEB18 
<220> 

<221> unsure 
<222> 22 

<223> unknown or other 
<400> 49 

Glu Pro Leu Ser Pro Pro Gly Arg Ile Pro Gly Ala Ala Gly Glu 
1 5 10 15 

Cys Glu Gly Pro Gln Gly Xaa Phe Ala Ser Arg Gln Pro Tyr Ser 
20 25 30 

Arg Phe Leu Leu Arg Tyr Trp His Leu Thr Pro Ile Thr Pro Trp 
35 40 45 

Ala Ile Val Pro Val Trp Ser Pro Arg Gly Arg Ser Arg Gly Ser 
50 55 60 

Pro Asn Ser Thr Ser Gln Thr Ser Ile Gln Ala Gly Thr Ser Thr 
65 70 75 

Leu Leu Ala Ser Arg His Gln Asn Ile Trp Glu Asp Met Cys Val 
80 85 90 

Ser Thr Cys Met Trp Gly His Thr Gly Gly Asn Met Gly Met Arg 
95 100 105 

Ala Val 



<210> 50 

<211> 645 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:475551.1.orf3:2000FEB18 
<220> 

<221> unsure 
<222> 141 

<223> unknown or other 
<400> 50 





Pin 

V7j.n 




pin 


Ser 


v^iy 


Aia 


Asp 




Asp 


Lys 


Arg 


vai Jjys Lys 


1 
X 








r 
O 










10 






15 


Leu 


Pro 


Leu 


Met. 


Ala 

AXa 


Leu 


Ser 


Thr 


Thr 


Met 


Ala 


G1U 


Ser Phe Lys 




















") c 
ZD 






3 0 




Leu 


Asp 


Pro 


Asp 


Ser 
DO 




0 u 
noil 


A en 


Tip 
11c 


Jrilfc; 


oin 


AT a 


Cor 


nbll 


riO 


ryr 


bell 


riO ±111 Aiy 










65 










70 




75 


Glv 




Thy 


1111 






OC51 


oiy 




nig 


JrilG 


niy 


P*\ r c; "D -v- <»>. O ay- 

\~>ys irlQ 061 










80 










85 






90 


Cys Arg His Glu Val Val Leu Asp Arg His Gly Val Tyr Gly Leu 
95 100 105 

Gln Arg Asn Leu Leu Val Glu Asn Ile Ile Asp Ile Tyr Lys Gln 
110 115 120 

Glu Cys Ser Ser Arg Pro Leu Gln Lys Gly Ser His Pro Met Cys 
125 130 135 

Lys Glu His Glu Asp Glu Lys Ile Asn Ile Tyr Cys Leu Thr Cys 
140 145 150 

Glu Val Pro Thr Cys Ser Met Cys Lys Val Phe Gly Ile His Lys 
155 160 165 

Ala Cys Glu Val Ala Pro Leu Gln Ser Val Phe Gln Gly Gln Lys 
170 175 180 

Thr Glu Leu Asn Asn Cys Ile Ser Met Leu Val Ala Gly Asn Asp 
185 190 195 

Arg Val Gln Thr Ile Ile Thr Gln Leu Glu Asp Ser Arg Arg Val 
200 205 210 

Thr Lys Glu Asn Ser His Gln Val Lys Glu Glu Leu Ser Gln Lys 
215 220 225 

Phe Asp Thr Leu Tyr Ala Ile Leu Asp Glu Lys Lys Ser Glu Leu 
230 235 240 

Leu Gln Arg Ile Thr Gln Glu Gln Glu Lys Lys Leu Ser Phe Ile 
245 250 255 

Glu Ala Leu Ile Gln Gln Tyr Gln Glu Gln Leu Asp Lys Ser Thr 
260 265 270 

Lys Leu Val Glu Thr Ala Ile Gln Ser Leu Asp Glu Pro Gly Gly 
275 280 285 

Ala Thr Phe Leu Leu Thr Ala Lys Gln Leu Ile Lys Ser Ile Val 
290 295 300 

Glu Ala Ser Lys Gly Cys Gln Leu Gly Lys Thr Glu Gln Gly Phe 
305 310 315 

Glu Asn Met Asp Phe Phe Thr Leu Asp Leu Glu His Ile Ala Asp 
320 325 330 

Ala Leu Arg Ala Ile Asp Phe Gly Thr Asp Glu Glu Glu Glu Glu 
335 340 345 

Phe Ile Glu Glu Glu Asp Gln Glu Glu Glu Glu Ser Thr Glu Gly 
350 355 360 

Lys Glu Glu Gly His Gln 
365 



<210> 58 

<211> 326 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:403872.1.orf3:2000MAY19 
<220> 

<221> unsure 
<222> 294 

<223> unknown or other 
<400> 58 



Glu Met Ala Val Gly Asn Asn Thr Gln Arg Ser Tyr Ser Ile Ile 
1 5 10 15 

Pro Cys Phe Ile Phe Val Glu Leu Val Ile Met Ala Gly Thr Val 
20 25 30 

Leu Leu Ala Tyr Tyr Phe Glu Cys Thr Asp Thr Phe Gln Val His 
35 40 45 

Ile Gln Gly Phe Phe Cys Gln Asp Gly Asp Leu Met Lys Pro Tyr 
50 55 60 

Pro Gly Thr Glu Glu Glu Ser Phe Ile Thr Pro Leu Val Leu Tyr 
65 70 75 

Cys Val Leu Ala Ala Thr Pro Thr Ala Ile Ile Phe Ile Gly Glu 
80 85 90 

Ile Ser Met Tyr Phe Ile Lys Ser Thr Arg Glu Ser Leu Ile Ala 
95 100 105 

Gln Glu Lys Thr Ile Leu Thr Gly Glu Cys Cys Tyr Leu Asn Pro 
110 115 120 

Leu Leu Arg Arg Ile Ile Arg Phe Thr Gly Val Phe Ala Phe Gly 
125 130 135 

Leu Phe Ala Thr Asp Ile Phe Val Asn Ala Gly Gln Val Val Thr 
140 145 150 

Gly His Leu Thr Pro Tyr Phe Leu Thr Val Cys Lys Pro Asn Tyr 
155 160 165 

Thr Ser Ala Asp Cys Gln Ala His His Gln Phe Ile Asn Asn Gly 
170 175 180 

Asn Ile Cys Thr Gly Asp Leu Glu Val Ile Glu Lys Ala Arg Arg 
185 190 195 

Ser Phe Pro Ser Lys His Ala Ala Leu Ser Ile Tyr Ser Ala Leu 
200 205 210 

Tyr Ala Thr Met Tyr Ile Thr Ser Thr Ile Lys Thr Lys Ser Ser 
215 220 225 

Arg Leu Ala Lys Pro Val Leu Cys Leu Gly Thr Leu Cys Thr Ala 
230 235 240 

Phe Leu Thr Gly Leu Asn Arg Val Ser Glu Tyr Arg Asn His Cys 
245 250 255 

Ser Asp Val Ile Ala Gly Phe Ile Leu Gly Thr Ala Val Ala Leu 
260 265 270 

Phe Leu Gly Met Cys Val Val His Asn Phe Lys Gly Thr Gln Gly 
275 280 285 

Ser Pro Ser Lys Pro Lys Pro Glu Xaa Pro Arg Gly Val Pro Leu 
290 295 300 

Met Ala Phe Pro Arg Ile Glu Ser Pro Leu Glu Thr Leu Ser Ala 
305 310 315 

Gln Asn His Ser Ala Ser Met Thr Glu Val Thr 
320 325 
<210> 59 

<211> 156 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:1135213.1.orf1:2000MAY19 
<400> 59 

Leu Cys Gly Asp Tyr Ser Cys Leu Thr Thr Glu Phe Pro Thr Glu 
1 5 10 15 

Ile Met Glu Glu Lys Gln Gln Ile Ile Leu Ala Asn Gln Asp Gly 
20 25 30 

Gly Thr Val Ala Gly Ala Ala Pro Thr Phe Phe Val Ile Leu Lys 
35 40 45 

Gln Pro Gly Asn Gly Lys Thr Asp Gln Gly Ile Leu Val Thr Asn 
50 55 60 

Gln Asp Ala Cys Ala Leu Ala Ser Ser Val Ser Ser Pro Val Lys 
65 70 75 

Ser Lys Gly Lys Ile Cys Leu Pro Ala Asp Cys Thr Val Gly Gly 
80 85 90 

Ile Thr Val Thr Leu Asp Asn Asn Ser Met Trp Asn Glu Phe Tyr 
95 100 105 

His Arg Ser Thr Glu Met Ile Leu Thr Lys Gln Gly Arg Arg Met 
110 115 120 

Phe Pro Tyr Cys Arg Tyr Trp Ile Thr Gly Leu Asp Ser Asn Leu 
125 130 135 

Lys Tyr Ile Leu Val Met Asp Ile Ser Pro Val Asp Asn His Arg 
140 145 150 

Tyr Lys Trp Asn Gly Arg 
155 

<210> 60 

<211> 262 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:474284.2.orf2:2000MAY19 
<400> 60 

Ser Ser Pro Thr Ser Trp Arg Ser Ser Met Pro Cys Thr Trp Arg 
1 5 10 15 

Ser Arg Arg Arg Arg Cys Thr Ala Cys Ser Ala Ala Ala Ala Pro 
20 25 30 

Pro Leu Pro Ala Gln Lys Val Cys Leu Arg Cys Glu Ala Pro Cys 
35 40 45 

Cys Gln Ser His Val Gln Thr His Leu Gln Gln Pro Ser Thr Ala 
50 55 60 

Arg Gly His Leu Leu Val Glu Ala Asp Asp Val Arg Ala Trp Ser 
65 70 75 

Cys Pro Gln His Asn Ala Tyr Arg Leu Tyr His Cys Glu Ala Glu 
80 85 90 

Gln Val Ala Val Cys Gln Tyr Cys Cys Tyr Tyr Ser Gly Ala His 
95 100 105 

Gln Gly His Ser Val Cys Asp Val Glu Ile Arg Arg Asn Glu Ile 
110 115 120 

Arg Lys Met Leu Met Lys Gln Gln Asp Arg Leu Glu Glu Arg Glu 
125 130 135 

Gln Asp Ile Glu Asp Gln Leu Tyr Lys Leu Glu Ser Asp Lys Arg 
140 145 150 

Leu Val Glu Glu Lys Val Asn Gln Leu Lys Glu Glu Val Arg Leu 
155 160 165 

Gln Tyr Glu Lys Leu His Gln Leu Leu Asp Glu Asp Leu Arg Gln 
170 175 180 
170 175 180 
Til IT 


Val 


Glu 


Val 


Leu 


Asp 


L»ys Ala Gin Aia jjys Fne cys ser 


GlU 










185 




ion 

Li* u 






Ala Ala 


Gin 


Ala 


Leu 


nis Lieu tjiy <aiu ATy jxieu i^iu 


Ala 










200 




one 


210 




Lys 


Leu 


Leu Gly 


Ser 


jjeu lain lieu i»eu irne Asp iiys rnr 


GlU 










215 




220 


225 


Asp 


Val 


Ser 


Phe 


Met 


Lys 


Asn Thr Lys Ser Val Lys lie Leu 


Met 










230 


235 - 


240 


Asp 


Ser 


Arg 


Cys 


Pro 


Val 


His Trp Pro Gin Asp Pro Asp Leu 


His 










245 




250 


255 


Glu 


Gin 


Gin 


Pro 


Phe 


Pro 


His 












260 









<210> 61 

<211> 132 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:342147.1.orf3:2000MAY19
<400> 61

Lys Thr Asn Leu Tyr Cys Ser Pro Tyr Phe Ile Asp Cys Asn Arg
1 5 10 15

Ser Ile Glu Val Thr Phe Ile Leu Ser Trp Ile Val Cys Ser Tyr
20 25 30

Ala Val Cys Lys Glu Arg Asn Gly Met Gly Gly Cys Glu Lys Glu
35 40 45

Glu Leu Val Val Asp Phe Gly Gly Ala Gly Trp Arg Ser Leu Cys
50 55 60

Leu Cys Ser Arg Leu Gly Cys Ala Ala Pro Arg Pro Arg Cys Pro
65 70 75

Asp Phe Arg Arg Pro Asp Ala Ser Leu Thr Ser Ala Ser Ala Arg
80 85 90

Gly Cys Trp Arg Pro Ser Trp Leu Arg Ser Ala Pro Pro Arg Ser
95 100 105

Pro Pro Thr Thr Cys Ala His Pro Ala Trp Arg Cys Pro Ser Pro
110 115 120

Arg Cys Arg Arg Thr Pro Ala Pro Phe Arg Cys Cys
125 130

<210> 62 
<211> 167 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: LG:1097300.1.orf2:2000MAY19
<400> 62

Pro Pro Arg Arg Arg Pro Cys Trp Phe Leu Cys Gly Leu Leu Ser
1 5 10 15

Arg Met Val Lys Leu Phe Ile Gly Asn Leu Pro Arg Glu Ala Thr
20 25 30

Glu Gln Glu Ile Arg Ser Leu Phe Glu Gln Tyr Gly Lys Val Leu
35 40 45

Glu Cys Asp Ile Ile Lys Asn Tyr Gly Phe Val His Ile Glu Asp
50 55 60

Lys Thr Ala Ala Glu Asp Ala Ile Arg Asn Leu His His His Lys
65 70 75

Pro His Gly Val Asn Ile Asn Ala Glu Ala Ser Lys Asn Lys Ser
80 85 90

Lys Ala Pro Thr Lys Leu His Val Gly Asn Ile Ser Pro Thr Cys
95 100 105

Thr Asn Gln Glu Leu Arg Ala Lys Phe Glu Glu His Gly Pro Ala
110 115 120

Ile Glu Cys Asp Ile Ala Lys Asp Tyr Ala Phe Ala His Met Glu
125 130 135

Arg Ala Glu Asp Ala Ala Glu Ala Ile Arg Gly Leu Asp Asn Thr
140 145 150

Glu Phe Gln Gly Glu Leu Leu Trp Ala Trp Val Val Ala Pro Ser
155 160 165

Gly Val



<210> 63 

<211> 570 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LG:444850.9.orf1:2000MAY19

<220>

<221> unsure

<222> 569-570

<223> unknown or other

<400> 63

Lys His Arg Gln Glu Asn Asn Ala Leu Asp Met Ala Pro Glu Ile
1 5 10 15

His Met Thr Gly Pro Met Cys Leu Ile Glu Asn Thr Asn Gly Glu
20 25 30

Leu Val Ala Asn Pro Glu Ala Leu Lys Ile Leu Ser Ala Ile Thr
35 40 45

Gln Pro Val Val Val Val Ala Ile Val Gly Leu Tyr Arg Thr Gly
50 55 60

Lys Ser Tyr Leu Met Asn Lys Leu Ala Gly Lys Asn Lys Gly Phe
65 70 75

Ser Leu Gly Ser Thr Val Lys Ser His Thr Lys Gly Ile Trp Met
80 85 90

Trp Cys Val Pro His Pro Lys Lys Pro Glu His Thr Leu Val Leu
95 100 105

Leu Asp Thr Glu Gly Leu Gly Asp Val Lys Lys Gly Asp Asn Gln
110 115 120

Asn Asp Ser Trp Ile Phe Thr Leu Ala Val Leu Leu Ser Ser Thr
125 130 135

Leu Val Tyr Asn Ser Met Gly Thr Ile Asn Gln Gln Ala Met Asp
140 145 150

Gln Leu Tyr Tyr Val Thr Glu Leu Thr His Arg Ile Arg Ser Lys
155 160 165

Ser Ser Pro Asp Glu Asn Glu Asn Glu Asp Ser Ala Asp Phe Val
170 175 180

Ser Phe Phe Pro Asp Phe Val Trp Thr Leu Arg Asp Phe Ser Leu
185 190 195

Asp Leu Glu Ala Asp Gly Gln Pro Leu Thr Pro Asp Glu Tyr Leu
200 205 210

Glu Tyr Ser Leu Lys Leu Thr Gln Gly Thr Ser Gln Lys Asp Lys
215 220 225

Asn Phe Asn Leu Pro Gln Leu Cys Ile Trp Lys Phe Phe Pro Lys
230 235 240

Lys Lys Cys Phe Val Phe Asp Leu Pro Ile His Arg Arg Lys Leu
245 250 255

Ala Gln Leu Glu Lys Leu Gln Asp Glu Glu Leu Asp Pro Glu Phe
260 265 270

Val Gln Gln Val Ala Asp Phe Cys Ser Tyr Ile Phe Ser Asn Ser
275 280 285

Lys Thr Lys Thr Leu Ser Gly Gly Ile Lys Val Asn Gly Pro Arg
290 295 300

Leu Glu Ser Leu Val Leu Thr Tyr Ile Asn Ala Ile Ser Arg Gly
305 310 315



46/69 



WO 01/62922 PCT/US01/05896 



Aq^ JJtzu. 


Pro Cys 


fits U 


nlii 
ulU 




Al a 
nia 


Val 


ucu 


Al a 
Aid 




Ala Gin 


He 








-j \j 










325 
j t* j 








330 




Ser 


Ala 


Al a 


Ua 1 

val 


Pin 
villi 




Al a 

Aid 


Tl 
He 


Al a 


nib 


Tyr Asp Gin 








335 








340 








345 


V?1I1 Ucl 


Gly Gin 


Lys 


Val 




llcll 


XT X. U 


Al a 


Pi ii 
win 


1X11 


Leu Gin 


Glu 


























360 


IJcU Llcll 


Asp Leu 


Hi e 
Ml S 


Arg 


Val 


Dei 


Pi 11 


ax y 


Pi n 
OlU 


Ala 

Aia 


Thr Glu Val 








fit: 

ODD 








J / u 








375 


Tyr Met 


Lys Asn 




nic 


T.ve 


nop 


Va 1 
veil 


A QT^ 
nop 


nio 


UcU 


Phe Gin Lys 








380 










385 








390 




Ala 


Ala 




±J tr Ll 


nop 








A cn 


nop 


Phe Cys 


Lys 








395 










400 








405 


Gin Acn 


Gin 


Glu 


Ala 


Ocl 


OCl 


no p 


nx y 


Cys 


day 


Al a 
nia 


Leu Leu 


Gin 








410 










415 








420 


Val lie 


Phe 


Ser 


Jr x \J 






ulU 


Old 


Val 

Vul 




Ala 


Gly He Tyr 








425 










430 

T J V 








435 




Pro Gly 




ryr 








Tl a 
11« 


Pi n 
^ni 


T .\rc 

■Liys 


Leu Gin Asp 








AAt) 










Aim 








450 


JJfciU. v3 J. U. 


Lys Lys 


iyr 


iyr 


Pin 
vjIU 


Pi ii 
OIU 




A vrr 

al y 


T,\re 

iiy 


Pi W 


He Gin 


Ala 








*i J J 










AGO 
^fc O V 








465 


Pin Glii 


He 


Leu 


pin 
ill 


lfll 


iyr 


Leu 




Ser 


Lys 


pin 
olU 


Ser Val 


Thr 








Aid 














480 


rtop Ala 


lie 


Leu 


bin 


lfll 


Asp 


t*in 


lie 


Leu 


inr 


blU 


Lys Glu 


Lys 








After 










AQfl 
ft 3 U 






495 


filii Tip 
wlU lie 


Glu Val 


Pi n 


PlfO 

v*y& 


Val 
val 


ijys 


Ala 
Ala 


Pi 11 
ulU 




A Ta 
nia 


Gin Ala 


Ser 






















CIA 

510 


Ala Lys Met Val Glu Glu Met Gln Ile Lys Tyr Gln Gln Leu Met
515 520 525

Glu Glu Lys Glu Lys Ser Tyr Gln Glu His Val Lys Gln Leu Thr
530 535 540

Glu Lys Met Glu Arg Glu Arg Ala Gln Leu Leu Glu Glu Gln Glu
545 550 555

Lys Thr Leu Thr Ser Lys Leu Gln Val Ser Lys Cys Lys Xaa Xaa
560 565 570


<210> 64
<211> 168
<212> PRT

<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: LG:402231.6.orf3:2000MAY19
<400> 64

Ala Leu Phe Ser Arg Ile Ile Gln Gln Leu Val Asn Gly Ile Ile
1 5 10 15

Thr Pro Ala Thr Ile Pro Ser Leu Gly Pro Trp Gly Val Leu His
20 25 30

Ser Asn Pro Met Asp Tyr Ala Trp Gly Ala Asn Gly Leu Asp Ala
35 40 45

Ile Ile Thr Gln Leu Leu Asn Gln Phe Glu Asn Thr Gly Pro Pro
50 55 60

Pro Ala Asp Lys Glu Lys Ile Gln Ala Leu Pro Thr Val Pro Val
65 70 75

Thr Glu Glu His Val Gly Ser Gly Leu Glu Cys Pro Val Cys Lys
80 85 90

Asp Asp Tyr Ala Leu Gly Glu Arg Val Arg Gln Leu Pro Cys Asn
95 100 105

His Leu Phe His Thr Thr Tyr Glu Gln Ala Trp Leu Glu Gln His
110 115 120

Asp Ser Cys Pro Val Cys Arg Lys Ser Leu Thr Gly Gln Asn Thr
125 130 135

Ala Thr Asn Pro Pro Gly Leu Thr Gly Val Ser Phe Ser Ser Ser
140 145 150

Ser Ser Ser Ser Ser Ser Ser Ser Pro Ser Asn Glu Asn Ala Thr
155 160 165

Ser Asn Ser



<210> 65 
<211> 246 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: LG:1076157.1.orf3:2000MAY19
<220>

<221> unsure
<222> 240

<223> unknown or other
<400> 65

Pro Lys Gln Gly Ile Asn Val Trp Ser Pro Arg His Pro Glu Asn
1 5 10 15

Phe Leu Gly Ile Glu Ser Arg Pro Pro Met Leu Ser Leu Ser Pro
20 25 30

Ile Leu Leu Tyr Thr Cys Glu Met Phe Gln Asp Pro Val Ala Phe
35 40 45

Lys Asp Val Ala Val Asn Phe Thr Gln Glu Glu Trp Ala Leu Leu
50 55 60

Asp Ile Ser Gln Arg Lys Leu Tyr Arg Glu Val Met Leu Glu Thr
65 70 75

Phe Arg Asn Leu Thr Ser Ile Gly Lys Lys Trp Lys Asp Gln Asn
80 85 90

Ile Glu Tyr Glu Tyr Gln Asn Pro Arg Arg Asn Phe Arg Ser Leu
95 100 105

Ile Glu Gly Asn Val Asn Glu Ile Lys Glu Asp Ser His Cys Gly
110 115 120

Glu Thr Phe Thr Gln Val Pro Asp Asp Arg Leu Asn Phe Gln Glu
125 130 135

Lys Lys Ala Ser Pro Glu Ala Lys Ser Cys Asp Asn Phe Val Cys
140 145 150

Gly Glu Val Gly Ile Gly Asn Ser Ser Phe Asn Met Asn Ile Arg
155 160 165

Gly Asp Ile Gly His Lys Ala Tyr Glu Tyr Gln Asp Tyr Ala Pro
170 175 180

Lys Pro Tyr Lys Cys Gln Gln Pro Lys Lys Ala Phe Arg Tyr His
185 190 195

Pro Ser Phe Arg Thr Gln Glu Arg Asn His Thr Gly Glu Lys Pro
200 205 210

Tyr Ala Cys Lys Glu Cys Gly Lys Thr Phe Ile Ser His Ser Gly
215 220 225

Ile Arg Arg Arg Met Val Met His Ser Gly Asp Gly Pro Leu Xaa
230 235 240

Val Ser Phe Val Gly Lys
245

<210> 66 
<211> 120 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:1083142.1.orf3:2000MAY19
<220>

<221> unsure
<222> 1

<223> unknown or other
<400> 66

Xaa Phe Pro Val Leu Glu Pro His Gln Val Gly Leu Ile Arg Ser
1 5 10 15

Tyr Asn Ser Lys Thr Met Thr Cys Phe Gln Glu Leu Val Thr Phe
20 25 30

Arg Asp Val Ala Ile Asp Phe Ser Arg Gln Glu Trp Glu Tyr Leu
35 40 45

Asp Pro Asn Gln Arg Asp Leu Tyr Arg Asp Val Met Leu Glu Asn
50 55 60

Tyr Arg Asn Leu Val Ser Leu Gly Gly His Ser Ile Ser Lys Pro
65 70 75

Val Val Val Asp Leu Leu Glu Arg Gly Lys Glu Pro Trp Met Ile
80 85 90

Leu Arg Glu Glu Thr Gln Phe Thr Asp Leu Asp Leu Gln Cys Glu
95 100 105

Ile Ile Ser Tyr Ile Glu Val Pro Thr Tyr Glu Thr Asp Ile Ser
110 115 120



<210> 67 
<211> 122 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: LG:1083264.1.orf2:2000MAY19
<400> 67

Lys Lys Ser Gln Lys Glu Ser Thr Gln Gln Thr Arg Ile His Phe
1 5 10 15

Gln Arg Asp Ile Leu Cys Lys Glu Ala Thr Trp Lys Arg Lys Glu
20 25 30

Lys Lys Ser Gly Met Ala Leu Thr Gln Gly Pro Leu Lys Phe Met
35 40 45

Asp Val Ala Ile Glu Phe Ser Gln Glu Glu Trp Lys Cys Leu Asp
50 55 60

Pro Ala Gln Arg Thr Leu Tyr Arg Asp Val Met Leu Glu Asn Tyr
65 70 75

Arg Asn Leu Val Ser Leu Gly Ile Cys Leu Pro Asp Leu Ser Val
80 85 90

Thr Ser Met Leu Glu Gln Lys Arg Asp Pro Trp Thr Leu Gln Ser
95 100 105

Glu Glu Lys Ile Ala Asn Asp Pro Asp Gly Arg Glu Cys Ile Gln
110 115 120

Lys Val



<210> 68 
<211> 428 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:350793.2.orf3:2000MAY19
<400> 68

Ala Gln Gly Ser Ser Trp Lys Leu Pro Phe Glu Arg Leu Ala Phe
1 5 10 15

Val Leu Ser Ser Asn Ser Leu Lys His Cys Thr Glu Leu Glu Leu
20 25 30

Phe Lys Ala Thr Cys Arg Trp Leu Arg Leu Glu Glu Pro Arg Met
35 40 45

Asp Phe Ala Ala Lys Leu Met Lys Asn Ile Arg Phe Pro Leu Met
50 55 60

Thr Pro Gln Glu Leu Ile Asn Tyr Val Gln Thr Val Asp Phe Met
65 70 75

Arg Thr Asp Asn Thr Cys Val Asn Leu Leu Leu Glu Ala Ser Asn
80 85 90

Tyr Gln Met Met Pro Tyr Met Gln Pro Val Met Gln Ser Asp Arg
95 100 105

Thr Ala Ile Arg Ser Asp Thr Thr His Leu Val Thr Leu Gly Gly
110 115 120

Val Leu Arg Gln Gln Leu Val Val Ser Lys Glu Leu Arg Met Tyr
125 130 135

Asp Glu Lys Ala His Glu Trp Lys Ser Leu Ala Pro Met Asp Ala
140 145 150

Pro Arg Tyr Gln His Gly Ile Ala Val Ile Gly Asn Phe Leu Tyr
155 160 165

Val Val Gly Gly Gln Ser Asn Tyr Asp Thr Lys Gly Lys Thr Ala
170 175 180

Val Asp Thr Val Phe Arg Phe Asp Pro Arg Tyr Asn Lys Trp Met
185 190 195

Gln Val Ala Ser Leu Asn Glu Lys Arg Thr Phe Phe His Leu Ser
200 205 210

Ala Leu Lys Gly Tyr Leu Tyr Ala Val Gly Gly Arg Asn Ala Ala
215 220 225

Gly Glu Leu Pro Thr Val Glu Cys Tyr Asn Pro Arg Thr Asn Glu
230 235 240

Trp Thr Tyr Val Ala Lys Met Ser Glu Pro His Tyr Gly His Ala
245 250 255

Gly Thr Val Tyr Gly Gly Val Met Tyr Ile Ser Gly Gly Ile Thr
260 265 270

His Asp Thr Phe Gln Lys Glu Leu Met Cys Phe Asp Pro Asp Thr
275 280 285

Asp Lys Trp Ile Gln Lys Ala Pro Met Thr Thr Val Arg Gly Leu
290 295 300

His Cys Met Cys Thr Val Gly Glu Arg Leu Tyr Val Ile Gly Gly
305 310 315

Asn His Phe Arg Gly Thr Ser Asp Tyr Asp Asp Val Leu Ser Cys
320 325 330

Glu Tyr Tyr Ser Pro Ile Leu Asp Gln Trp Thr Pro Ile Ala Ala
335 340 345

Met Leu Arg Gly Gln Ser Asp Val Gly Val Ala Val Phe Glu Asn
350 355 360

Lys Ile Tyr Val Val Gly Gly Tyr Ser Trp Asn Asn Arg Cys Met
365 370 375

Val Glu Ile Val Gln Lys Tyr Asp Pro Asp Lys Asp Glu Trp His
380 385 390

Lys Val Phe Asp Leu Pro Glu Ser Leu Gly Gly Ile Arg Ala Cys
395 400 405

Thr Leu Thr Val Phe Pro Pro Glu Glu Thr Thr Pro Ser Pro Ser
410 415 420

Arg Glu Ser Pro Leu Ser Ala Pro
425



<210> 69 
<211> 307 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LG:408751.3.orf2:2000MAY19
<400> 69

Arg Asp Pro Gly Trp Gln Ile Arg Asp Arg Ala Gly Leu Ala Trp
1 5 10 15

Asn Met Leu Ala Asn Ser Ala Ser Val Arg Ile Leu Ile Lys Gly
20 25 30

Gly Lys Val Val Asn Asp Asp Cys Thr His Glu Ala Asp Val Tyr
35 40 45

Ile Glu Asn Gly Ile Ile Gln Gln Val Gly Arg Glu Leu Met Ile
50 55 60



Pro 


Glv 


Gly 




Val 


He 


Asp 


Ala 


Thr 

/ v 


Gly 


Ti\/cj 


Leu Val 


Tin 
X Xe 

/ D 


Pro 


Gly 


Gly 


He Asp 
an 


Thr 


Ser 


Thr 


His 


Phe 
ft ^ 


His 


Gin 


Thr Phe 


Met 


Asn 


Ala 


Thr 


cys Val 


Asp 


Asp 


Phe 


Tyr 


His 
i on 

X \J u 


Gly 


Thr 


Lys Ala 


Ala 


Leu 


Val 




Gl v Thr 

11 0 

-L 4» W 


JL XJ.J- 


Met* 


Tie 


Tip 

X JL C 


vjx_y 

JL -L ~> 




Val 


Leu Pro 


Asp 
i 9n 

16U 


Lys 


Glu 


Thr 


Ser Leu 


Val 


Asp 


Ala 


Tyr 


Glu 


Lys 


Cys 


Arg Gly 


Leu 








125 










nn 

JL J \J 








Ala 


Asp 


Pro 


Lys Val 
140 


Cys 


Cys 


Asp 


Tyr 


Ala 


Leu 


His 


Val Gly 


He 


Thr 


Trp 


Trp 


Ala Pro 
155 


Lys 


Val 


Lys 


Ala 


Glu 
1 fin 

low 


Met 


Glu 


Thr Leu 


Val 


Arg 


Glu 


Lys 


Gly Val 


Asn 


Ser 


Phe 


Gin 


Met 


Phe 


Met 


Thr Tyr 


Lys 


















X / D 






xoU 


Asp 


Leu 


Tyr 


Met Leu 

IOj 


Arg 


Asp 


Ser 


Glu 


Leu 
ion 


Tyr 


Gin 


Val Leu 


His 


Ala 


Cys 


Lys 


Asp He 

9nn 


Gly 


Ala 


He 


Ala 


Arg 
n^ 

Z Uj 


Val 


His 


Ala Glu 


Asn 

Z xU 


Gly 


Glu 


Leu 


Val Ala 

91 R 


Glu 


Gly 


Ala 


Lys 


GlU 

99 n 
z z u 


Ala 


Leu 


Asp Leu 


Gly 

ZZD 


He 


Thr 


Gly 


Pro Glu 

91 0 


Gly 


lie 


Glu 


He 


Ser 

z j j 


Arg 


Pro 


Glu Glu 


Leu 


Glu 


Ala 


Glu 


Ala Thr 
245 


His 


Arg 


Val 


He 


Thr 
250 


Arg 


Asp 


Gly Gly 


Asn 
255 


His 


Asp 


Ala 


Ala Ser 
260 


Trp 


Cys 


Ser 


Ala 


His 
265 


His 


Leu 


Tyr Pro 


Cys 
270 


Gin 


Pro 


Ser 


Leu Gly 
275 


His 


Gly 


Pro 


Trp 


Ala 
280 


Asp 


Val 


Lys Glu 


Pro 
285 


Ser 


Ser 


Ser 


Gly Gly 
290 


Gly 


Gin 


Leu 


Gly 


Arg 
295 


Ala 


Ser 


Leu Leu 


Gly 
300 


Leu 


Gly 


Lys 


Leu Tyr 


Leu 


Leu 

















305 



<210> 70 
<211> 198 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:336120.1.orf1:2000MAY01
<400> 70

Ile Ile Pro Gln Arg Ser Asn Gly Asp Arg Trp Gly Arg Ser Leu
1 5 10 15

Leu Pro Ser Arg Thr Phe Leu Gln Ala Leu Asn Leu Gly Ile Glu
20 25 30

Val Ile Asn Thr Thr Asp Tyr Leu His Phe Ser Lys Glu Cys Ser
35 40 45

Arg Ala Leu Leu Lys Met Gln Tyr Cys Pro His Cys Gln Gly Leu
50 55 60

Ala Leu Thr Lys Pro Cys Met Gly Tyr Cys Leu Asn Val Met Arg
65 70 75

Gly Cys Leu Ala His Met Ala Glu Leu Asn Pro His Trp His Ala
80 85 90

Tyr Ile Arg Ser Leu Glu Glu Leu Ser Asp Ala Met His Gly Thr
95 100 105

Tyr Asp Ile Gly His Val Leu Leu Asn Phe His Leu Leu Val Asn
110 115 120

Asp Ala Val Leu Gln Ala His Leu Asn Gly Gln Lys Leu Leu Glu
125 130 135

Gln Val Asn Arg Ile Cys Gly Arg Pro Val Arg Thr Pro Thr Gln
140 145 150

Ser Pro Arg Cys Ser Phe Asp Gln Ser Lys Glu Lys His Gly Met
155 160 165

Lys Thr Thr Thr Arg Asn Ser Glu Glu Thr Leu Ala Asn Arg Arg
170 175 180

Lys Glu Phe Ile Asn Ser Leu Ser Thr Val Gln Val Ile Leu Trp
185 190 195

Arg Ser Ser



<210> 71 

<211> 227 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:234104.2.orf1:2000MAY01
<400> 71

Ala Thr Pro Ser Gly Arg Pro Gln Ser Trp Thr Arg Phe Ser Leu
1 5 10 15

Trp Arg Gly Pro Arg Arg Thr Arg Pro Ser Pro Pro Ala Pro Ala
20 25 30

Pro Ala Gly Met Gly Ser Glu His Asp Gly Arg Ser Gly Pro Val
35 40 45

Leu Thr Pro Ala Asp Thr Leu His Pro Pro Thr Arg Leu Gln Pro
50 55 60

Ser Pro Pro Asp Thr His Pro Gly Gly Ser Ser Leu Pro Ala Pro
65 70 75

Arg Pro Ala Leu Ser Cys Trp Ala Arg Val Phe Ala Ser Leu Val
80 85 90

Arg Pro Ala Gly Phe Pro Gly Gly Thr His Gly Ala Pro Gly Met
95 100 105

Pro Leu Gly Ser Pro Ser Thr Ser Thr Ala Gln Trp Pro Tyr Val
110 115 120

Gln Leu Val Pro Gly Pro Arg Val Arg Lys Thr Ala Ser Arg Ser
125 130 135

His Cys Gln Glu Arg Ala Glu Glu Trp Ser Gly Pro Arg Arg Pro
140 145 150

Trp Gly Glu Gly Asp Pro Gly Pro Val Thr Ala Thr Pro Gly Thr
155 160 165

Pro Gly Gly Ala Pro Thr Ser Ala Phe Ser Cys Ala Ala Lys Leu
170 175 180

Gln Lys Pro Asp Ala Gly Leu Val Val Ala Asn Gly Thr Met Cys
185 190 195

Cys Pro Ala Lys His Thr Trp Arg Ser Gly Pro Lys Ile Pro Ile
200 205 210

Leu Asp Phe His Pro Ala Pro Ser Ser Thr Pro Arg Ser Ala Leu
215 220 225

Ser His



<210> 72 

<211> 122 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:450887.1.orf3:2000MAY01
<400> 72

Ser Val His Phe Ser Arg Lys Gly Phe Val Leu Met Ala Pro Pro
1 5 10 15

Gln Pro Lys Ser Gly Leu Phe Val Gly Ile Asn Lys Gly His Val
20 25 30

Val Thr Lys Arg Glu Leu Pro Pro Arg Pro Cys His Arg Lys Gly
35 40 45

Lys Ser Thr Lys Arg Val Ser Met Val Arg Gly Leu Ile Arg Glu
50 55 60

Val Ala Gly Phe Ala Pro Tyr Glu Lys Arg Ile Thr Glu Leu Leu
65 70 75

Lys Val Gly Lys Asp Lys Arg Ala Leu Lys Leu Ala Lys Arg Lys
80 85 90

Leu Gly Thr His Lys Arg Ala Lys Lys Lys Arg Glu Glu Met Ala
95 100 105

Gly Val Leu Arg Lys Met Arg Ser Ala Gly Thr His Thr Asp Lys
110 115 120

Lys Lys

<210> 73 

<211> 209 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: LI:119992.3.orf2:2000MAY01
<400> 73

Cys Ser Gln Ile Glu Leu Ala Ile Glu Leu Asp Ser Thr His Leu
1 5 10 15

Val Thr Leu Gly Gly Val Leu Arg Gln Gln Leu Val Val Ser Lys
20 25 30

Glu Leu Arg Met Tyr Asp Glu Arg Ala Gln Glu Trp Arg Ser Leu
35 40 45

Ala Pro Met Asp Ala Pro Arg Tyr Gln His Gly Tyr Trp Leu Phe
50 55 60

Ile Gly Asn Phe Leu Tyr Val Val Gly Gly Gln Ser Asn Tyr Asp
65 70 75

Thr Lys Gly Lys Thr Ala Val Asp Thr Val Phe Arg Phe Asp Pro
80 85 90

Arg Tyr Asn Lys Trp Met Gln Val Ala Ser Leu Asn Glu Lys Arg
95 100 105

Thr Phe Phe His Leu Ser Ala Leu Lys Gly His Leu Tyr Ala Val
110 115 120

Gly Gly Arg Ser Ala Ala Gly Glu Leu Gly Thr Val Glu Cys Tyr
125 130 135

Asn Pro Arg Met Asn Glu Trp Ser Tyr Val Ala Lys Met Ser Glu
140 145 150

Pro His Tyr Gly His Ala Gly Thr Val Tyr Gly Gly Leu Met Tyr
155 160 165

Ile Ser Gly Gly Ile Thr His Asp Thr Phe Gln Asn Glu Leu Met
170 175 180

Cys Phe Asp Pro Asp Thr Asp Lys Trp Met Gln Lys Ala Pro Met
185 190 195

Thr Thr Val Arg Gly Leu His Cys Met Cys Thr Arg Trp Arg
200 205



<210> 74 
<211> 312 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: LI:197241.2.orf1:2000MAY01
<400> 74

Tyr Ser Arg Ile Leu Ile Leu Gln Met Phe Ile Leu Gly Ala Ile
1 5 10 15

Ile Gln Ile Leu Pro Trp Val Met Ala Ser Gln Asn Ser Lys His
20 25 30

His Pro Glu Leu Val Asp Leu Phe Ser Arg Ser Gly Ile Tyr Ile
35



Lys 


Gin 


Val Val 


Leu 
50 


Cys 


Lys 


Phe 


Lys 


Gly 


Gin Val 


Tyr 
65 


Thr 


Cys 


Gly 


Arg 


Asp 


Met Gly 


Asp 
80 


Glu 


Gin 


Thr 


Glu 


Gly 


Leu Asn 


Gly 
95 


His 


Asn 


Cys 


Asp 


His 


Thr Val 


Val 
110 


Leu 


Thr 


Glu 


Gly 


Leu 


Asn He 


Phe 
125 


His 


Gin 


Leu 


Ser 


Cys 


Asn Val 


Pro 
140 


Arg 


Gin 


lie 


Arg 


Thr 


He He 


Gly 
155 


Val 


Ala 


Ala 


Trp 


Thr 


Arg Glu 


Ala 
170 


Val 


Tyr 


Thr 


Leu 


Gly 


Cys Leu 


Leu 
185 


Asp 


Pro 


Asn 


Pro 


Arg 


Gin Val 


Ser 
200 


Ala 


Leu 


His 


Leu 


Val 


Ala Ala 


Ser 


Asp 


Gly 


Ala 
Gly


Asp 


He Tyr 


Leu 
230 


Leu 


Ala 


Asp 


Ser 


Lys 


Gin Leu 


Asn 
245 


Leu 


Lys 


Lys 


Met 


Glu 


Tyr Lys 


Val 
260 


Asp 


Pro 


Glu 


Gin 


Lys 


He Cys 


He 
275 


Leu 


Ala 


Met 


Cys 


Trp 


Arg Ser 


Val 
290 


Asn 


Ser 


Ser 


Leu 


Ser 


Thr Ser 


Gly 


Ser 


Ser 


Phe 
Leu


Tyr 


Val 


Met 


Leu 


Glu 


Met 


Thr 


1 








5 








Ser 


Gin 


Leu 


Ala 


Leu 


Phe 


Ser 


Arg 










20 






Ala 


Glu 


Asp 


Leu 


Ala 


Gly 


Glu 


Ala 










35 






Leu 


Cys 


Ala 


Pro 


Leu 
50 


His 


Ala 


His 


He 


Val 


His 


Pro 


Ala 


Ala 


Arg 


Ser 










65 






Pro 


Gly 


Arg 


Val 


Glu 
80 


Leu 


Arg 


Cys 


Gin 


Val 


Arg 


Trp 


Tyr 
95 


Lys 


Asp 


Gly 


Ala 


Leu 


Gin 


Leu 


Gly 


Ala 


Glu 


Gly 










110 






Pro 


His 


Ala 


Gin 


Pro 


Glu 


Asp 


Ala 










125 






Arg 


His 


Glu 


Ala 


He 


Thr 


Phe 


Asn 





40 










45 


His 


Ser 


Val 


Phe 


Leu 


Ser 


Gin 




55 










60 


His 


Gly 


Pro 


Gly 


Arg 


Ala 


He 




70 










75 


Cys 


Leu 


Val 


Pro 


Arg 


Leu 


Val 




85 










90 


Ser 


Gin 


Val 


Ala 


Ala 


Ala 


Lys 




100 










105 


Asp 


Gly 


Cys 


Val 


Tyr 


Thr 


Phe 




115 










120 


Gly 


lie 


He 


Pro 


Pro 


Pro 


Ser 




130 










135 


Gin 


Ala 


Lys 


Tyr 


Leu 


Lys 


Gly 




145 










150 


Gly 


Arg 


Phe 


His 


Thr 


Val 


Leu 




160 










165 


Met 


Gly 


Leu 


His 


Gly 


Gly 


Gin 




175 










180 


Gly 


Glu 


Lys 


Cys 


Val 


Thr 


Ala 




190 










195 


His 


Lys 


Asp 


He 


Ala 


Leu 


Ser 




205 










210 


Thr 


Val 


Cys 


Val 


Thr 


Thr 


Arg 




220 










225 


Tyr 


Gin 


Cys 


Lys 


Lys 


Met 


Ala 




235 










240 


Val 


Leu 


Val 


Ser 


Gly 


Gly 


His 




250 










255 


His 


Leu 


Lys 


Glu 


Asn 


Gly 


Gly 




265 










270 


Asp 


Gly 


Ala 


Gly 


Arg 


Val 


Phe 




280 










285 


Leu 


Lys 


Gin 


Cys 


Arg 


Leu 


Gly 




295 










300 


Leu 


He 


Trp 


Leu 










310 
Arg


Pro 


Ser 


Ser 


Leu 


Ser 


Leu 




10 










15 


Ala 


Val 


Leu 


Pro 


Val 


Gly 


Arg 




25 










30 


Gly 


Glu 


Ala 


Cys 


Trp 


Pro 


Ser 




40 










45 


Pro 


Pro 


Ala 


Pro 


Pro 


Glu 


Arg 




55 










60 


Leu 


Asp 


Leu 


His 


Phe 


Gly 


Ala 




70 










75 


Glu 


Val 


Ala 


Pro 


Ala 


Gly 


Ser 




85 








90 


Leu 


Glu 


Val 


Glu 


Ala 


Ser 


Asp 




100 










105 


Pro 


Thr 


Arg 


Thr 


Leu 


Thr 


Leu 




115 










120 


Gly 


Glu 


Tyr 


Val 


Cys 


Glu 


Thr 




130 










135 


Val 


He 


Leu 


Ala 


GlU 


Pro 


Pro 
Val Gln Phe Leu Ala
145 150

Leu Glu Thr Thr Pro Ser Pro Leu Cys Val

160 165

Val Val Gln Glu Gly Glu Gly Leu Glu Leu

175 180
Ala Glu Ser Leu His
190
Arg Thr Cys
1 

Arg Val Arg 


Ala 


Pro 


Gly 


Ala 


Ala 


Ala 


Arg 


Arg 


Arg 


Ala 


Pro 


Leu 


Ala 


Leu 


Leu 


Gly 


Glu 


Arg 


He 


Met 


Ser 


Pro 


Asp 


Thr 


Pro 


Phe 


Asp 


Arg 


Cys 


Pro 


Met 


Thr 


Thr 


Arg 


Asn 


Gly 


Pro 


Ala 


Trp 


He 


Gin 


Ser 


Phe 


Glu 


Gln


Cys 


Ile


Arg 


Glu 


Gly 


Lys 


Lys 


Ser 


Phe 



Cys 


Arg 
5 


Val 


Leu 


Arg 
20 


Arg 


Arg 


Ala 
35 


Ala 


Gly 


Ser 
50 


Gly 


Ser 


Val 
65 


Gly 


Pro 


Gly 
80 


Leu 


Ser 


His 
95 


Trp 


Thr 


Ala 
110 


Pro 


Ile


Tyr 
125 


Lys 


Val 


Asp 
140 


Met 


Thr 


Pro 
155 


Tyr 


Pro 


Asp 
170 


Tyr 


Gly 


Asn 
185 


Asn 


Lys 


Val 
200 


Cys 


Ser 


Pro 
215 


Ala 


Leu 


Met 
230 


Thr 


Glu 


Arg 
245 


His 


His 


Glu 
260 


Thr 


Cys 


Pro 
275 


Cys 


Leu 


Glu 
290 


Tyr 



Val Pro Glu
Arg Gln Arg
Leu Leu Val
Arg Leu Ser
Gly Ala Gly
Pro Pro Ser
Asp Pro Thr
Gln Cys Leu
Glu Pro Pro
Thr Lys Ile
Glu Gly Gly
Pro Ile His
Thr Val Arg
Leu Ser Ile
Gln Ser Ile
Glu Asn Pro
Pro Gly Asp
Ile Arg Val
Pro Glu Pro
Tyr Asp Phe



Ala 


Lys 


Gln


10 






Arg 


Ala 


Pro 


25 






Leu 


Leu 


Ala 


40 






Cys 


Arg 


Met 


55 






Gly 


Pro 


Gly 


70 






Ala 


Ala 


Ala 


85 






Leu 


Ser 


Ser 


100 






Leu 


Arg 


Ile


115 






Pro 


Gly 


Met 


130 






His 


Ala 


Leu 


145 






Phe 


Phe 


Leu 


160 






Pro 


Pro 


Arg 


175 






Phe 


Asn 


Pro 


190 






Leu 


Gly 


Thr 


205 






Ser 


Ser 


Val 


220 






Tyr 


His 


Asn 


235 






Ser 


Lys 


Asn 


250 






Ala 


Val 


Cys 


265 






Leu 


Arg 


Gly 



280 
Tyr 
295 



Arg Trp 


Arg 




15 


Gly Arg 


Arg 




30 


Leu Ala 


Ala 




45 


Cys Gly 


Arg 




60 


Ser Gly 


Leu 




75 


His Gly 


Ala 




90 


Asp Trp 


Asp 




105 


Lys Arg 


Asp 




120 


Phe Val 


Val 




135 


Ile Thr


Gly 




150 


Phe Val 


Phe 




165 


Val Lys 


Leu 




180 


Asn Phe 


Tyr 




195 


Trp Thr 


Gly 




210 


Leu Ile


Ser 




225 


Glu Pro 


Gly 




240 


Tyr Asn 


Glu 




255 


Asp Met 


Met 




270 


Val Met 


Glu 




285 
Ala


Pro 


Arg 


Leu 


Trp 


Ala 


Cys 


1 








5 






Ser 


Gly 


Pro 


Pro 


Ala 


Arg 


Cys 










20 






Gly 


Gln


Ile


Glu 


Arg 


Arg 


Gly 










35 






Leu 


Cys 


Gly 


Ser 


Ala 


Ala 


Val 










50 






Glu 


Arg 


Asp 


Ser 


Ala 


Ala 


Val 










65 






Asp 


Ile


Asn 


Val 


Ile


Thr 


Gly 










80 






Leu 


Pro 


Thr 


Pro 


Leu 


Ile


Thr 










95 






Glu 


Ala 


Met 


Ala 


Pro 


Gly 


Thr 










110 




Leu 


Arg 


Ala 


Pro 


Glu 


Gly 


Ser 










125 




Arg 


Ala 


Thr 


Leu 


Thr 


Leu 


Leu 










140 






Ser 


Phe 


His 


Ala 


Tyr 


Asn 


Arg 










155 






Cys 


Phe 


Gly 


Pro 


Val 


Leu 


Leu 










170 






Pro 


Arg 


Ala 


Arg 


Ser 


Ser 


Gly 










185 






Phe 


Lys 


His 


His 


Ile


Glu 


Val 










200 






Pro 


Asp 


Pro 


Arg 


Leu 


Pro 


Arg 










215 






Leu 


Arg 


Pro 


Lys 


Arg 


Gln


Pro 










230 






Pro 


Glu 


Val 


Val 


Thr 


Arg 


Pro 










245 




Pro 


Pro 


Ser 


Asn 


Arg 


Tyr 


Ala 










260 




Gly 


Leu 


Pro 


Asp 


Leu 


Trp 


Ala 










275 






Arg 


Pro 


Leu 











Pro 


Cys 


His 


Cys 


Trp Trp 


Ser Gly 






10 






15 


Pro 


Tyr 


Ile


Ile


Gln Lys


Cys Val 






25 






30 


Leu 


Arg 


Val 


Val 


Gly Leu 


Tyr Arg 






40 






45 


Lys 


Lys 


Glu 


Leu 


Arg Asp 


Ala Phe 






55 






60 


Cys 


Leu 


Ser 


Glu 


Asp Leu 


Tyr Pro 






70 






75 


Ile


Leu 


Lys 


Asp 


Tyr Leu 


Arg Glu 






85 






90 


Gln


Pro 


Leu 


Tyr 


Lys Val 


Val Leu 






100 






105 


Pro 


Gln


Thr 


Glu 


Phe Pro 


Pro Pro 






115 






120 


Tyr 


Ser 


Cys 


Leu 


Pro Asp 


Val Glu 






130 






135 


Leu 


Asp 


His 


Leu 


Arg Leu 


Val Ser 






145 






150 


Met 


Thr 


Pro 


Gln


Asn Leu 


Ala Val 






160 






165 


Pro 


Ala 


Arg 


Gln


Ala Pro 


Thr Arg 






175 






180 


Pro 


Gly 


Leu 


Ala 


Ser Ala 


Val Asp 






190 






195 


Leu 


His 


Tyr 


Leu 


Leu Gln


Ser Trp 






205 






210 


Gln


Ser 


Pro 


Asp 


Val Ala 


Pro Tyr 






220 




225 


Pro 


Leu 


His 


Leu 


Pro Leu 


Ala Asp 






235 






240 


Arg 


Gly 


Arg 


Gly 


Gly Pro 


Glu Ser 






250 






255 


Gly 


Asp 


Trp 


Ser 


Val Cys 


Gly Arg 






265 






270 


Gly 


Phe 


Pro 


Val 


Arg Ala 


Arg Leu 
285
Leu


Ala 


Ala 


Pro 


Gln


Ser 


His 


Ser 


Ile


Pro 


Ser 


Pro Pro 


Gly Ala 


1 








5 










10 






15 


His 


Leu 


Leu 


Lys 


Thr 


Arg 


Val 


Leu 


Pro 


Ser 


Ala 


Arg Arg 


Ala Arg 










20 










25 






30 


Ala 


Arg 


Gly 


Ala 


Arg 


Glu 


Leu 


Arg 


Ser 


Ala 


Arg 


Ala Met 


Gly Pro 










35 










40 






45 


Pro 


Pro 


Gly 


Ala 


Gly 


Val 


Ser 


Cys 


Arg 


Gly 


Gly 


Cys Gly 


Phe Ser 










50 










55 






60 


Arg 


Leu 


Leu 


Ala 


Trp 


Cys 


Phe 


Leu 


Leu 


Ala 


Leu 


Ser Pro 


Gln Ala










65 










70 






75 


Pro 


Gly 


Ser 


Arg 


Gly 


Ala 


Glu 


Ala 


Val 


Trp 


Thr 


Ala Tyr 


Leu Asn 










80 










85 






90 


Val 


Ser 


Trp 


Arg 


Val 


Pro 


His 


Thr 


Gly 


Val 


Asn 


Arg Thr 


Val Trp 










95 










100 






105 


Glu 


Leu 


Ser 


Glu 


Glu 


Gly 


Val 


Tyr 


Gly 


Pro 


Asp 


Ser Pro 


Leu Glu 
110 115 120



Pro 


Val 


Ala 


Gly Val 


Leu 


Val 


Pro 


Pro 


Asp 


Gly 


Pro Gly Ala 


Leu 








125 










130 






135 


Asn 


Ala 


Cys 


Asn Pro 


His 


Thr 


Asn 


Phe 


Thr 


Val 


Pro Thr Val 


Trp 






140 










145 






150 


Gly 


Ser 


Thr 


Val Gln


Val 


Ser 


Trp 


Leu 


Gly 


Leu 


Ile Gln Arg


Gly 








155 










160 






165 


Gly 


Gly 


Cys 


Thr Phe 


Ala 


Asp 


Lys 


Ile


His 


Leu 


Ala Tyr Glu 


Arg 








170 










175 






180 


Gly 


Ala 


Ser 


Gly Ala 


Val 


Ile


Phe 


Asn 


Phe 


Pro 


Gly Thr Arg 


Asn 








185 










190 






195 


Glu 


Val 


Ile


Pro Met 


Ser 


His 


Pro 


Gly 


Ala 


Val 


Asp Ile Val


Ala 








200 










205 






210 


Ile


Met 


Ile


Arg Gln


Ser 


Glu 


Arg 


His 


Lys 


Asn 


Ser Ala Ile


Tyr 








215 








220 






225 


Ser 


Lys 


Arg 


His Thr 


Ser 


Asp 


Asn 


Gly 


His 


Arg 


Ser Arg Glu 


Lys 








230 










235 






240 


Thr 


Trp 


Pro 


Leu Gly 


Glu 


Ser 


Leu 


Phe 


Asn 


Phe 


Phe Arg Phe 


Leu 






245 










250 






255 


Cys 


Pro 


Phe 


Leu Leu 


Leu 


Arg 


Arg 


Ala 


Thr 


Val 


Gly Tyr Phe 


Ile








260 










265 






270 


Phe 


Tyr 


Ser 


Ala Arg 


Arg 


Leu 


Arg 


Asn 


Ala 


Arg 


Ala Gln Ser


Arg 








275 










280 






285 


Lys 


Gln


Arg 


Pro Ile


Lys 


Gly 


Arg 


Cys 
Gly


Ala 


Thr Pro 


Arg 


Ala Gly 


Glu Arg 


Ala 


Pro 


Leu Leu Pro 


Asp 


1 






5 






10 






15 


Arg 


Ala 


Ala His 


Ala 


Ala Ser 


Gly Thr 


Ile


Thr 


Val Ala Gly 


Arg 








20 






25 






30 


Arg 


Pro 


Val Gln


Ile


Leu Ser 


Glu Phe 


Phe 


Gly 


Ala Phe Ser 


Pro 








35 






40 




45 


Arg 


Lys 


Leu Ala 


Ile


Gln Lys


Cys Ala 


Ser 


Arg 


Thr Ala Ala 


Ala 








50 




55 






60 


Met 


Gly 


Ser Glu 


Asp 


His Gly 


Ala Gln


Lys 


Pro 


Ser Cys Lys 


Ile








65 






70 






75 


Met 


Thr 


Phe Arg 


Pro 


Thr Met 


Gly Glu 


Phe 


Lys 


Asp Phe Asn 


Lys 








80 






85 




90 


Tyr 


Val 


Gly Tyr 


Ile


Glu Ser 


Gln Gly


Ala 


His 


Arg Ala Gly 


Leu 








95 






100 






105 


Gly 


Lys 


Ile Ile


Pro 


Pro Lys 


Glu Trp 


Lys 


Pro 


Arg Gln Thr


Tyr 








110 






115 






120 


Asp 


Asp 


Ile Asp


Asp 


Val Val 


Ile Pro


Gly 


Pro 


Ile Gln Gln


Val 








125 






130 






135 


Val 


Thr 


Gly Gln


Ser 


Gly Leu 


Phe Thr 


Gln


Tyr 


Asn Ile Gln


Lys 








140 




145 






150 


Lys 


Gly 


Met Thr 


Val 


Gly Glu 


Tyr Arg 


Arg 


Leu 


Gly Asn Ser 


Glu 








155 






160 






165 


Lys 


Tyr 


Cys Thr 


Pro 


Arg Asp 


Gln Asp


Phe 


Asp 


Asp Leu Glu 


Arg 








170 






175 






180 


Lys 


Tyr 


Trp Glu 


Gly 


Thr Leu 


Thr Leu 


Cys 


Leu 


Pro Asp Leu 


Arg 








185 






190 






195 



Gly 
Glu


Gly 


Trp 


Thr 


Gln


Pro 


Gln


Gln Ala


Gly 


Glu 


Gly Pro His 


Pro 


1 








5 








10 






15 


Ala 


Ala 


His 


Glu 


Cys 


Leu 


His 


Asp Leu 


Gln


Gln


Ala Ala Pro 


Gly 










20 








25 






30 


Pro 


Gly 


Pro 


Pro 


Ala 


Ser 


Ser 


Gln Pro


Gly 


Gln


Pro Asp Arg 


Gln










35 








40 






45 


Gln


Asp 


Pro 


Gly 


Arg 


Val 


Val 


Val Cys 


Pro 


Gly 


Ala Gln Gly


Glu 










50 








55 






60 


Ala 


Glu 


Val 


Pro 


Arg 


Pro 


Gly 


Leu Pro 


Gly 


Glu 


Gly Gly Pro 


Leu 










65 








70 






75 


Gln


Gly 


Pro 


Pro 


Ser 


Ile


Gly 


Ser Gly 


Ala 


Thr 


Arg Thr Glu 


Arg 










80 








85 






90 


Ser 


Pro 


Ala 


Gln


Arg 


Pro 


Ser 


Pro Arg 


Ser 


Leu 


Gly Leu Ala 


Gly 










95 








100 






105 


Gly 


His 


Lys 


Glu 


Thr 


Arg 


Glu 


Arg Ser 


Met 


Ser 


Glu Thr Gly 


Thr 










110 








115 




120 


Ala 


Ala 


Cys 


Pro 


Trp 


Val 


Cys 


Pro Arg 


Glu 


Leu 


Leu Ser Val 


Ala 










125 








130 






135 


Ala 


Gln


Thr 


Leu 


Leu 


Ser 


Ser 


Asp Thr 


Lys 


Ala 


Pro Gly Ser 


Ser 










140 








145 




150 


Ser 


Cys 


Gly 


Ala 


Glu 


Arg 


Leu 


His Thr 


Val 


Gly 


Gly Pro Gly 


Ser 










155 








160 




165 


Ala 


Arg 


Pro 


Arg 


Ala 


Phe 


Ser 


His Ser 


Gly 


Val 


His Ser Leu 


Asp 










170 








175 






180 


Gly 


Gly 


Glu 


Val 


Asp 


Ser 


Gln


Ala Leu 


Gln


Glu 


Leu Thr Gln


Met 










185 








190 






195 


Val 


Ser 


Gly 


Pro 


Ala 


Ser 


Tyr 


Ser Gly 


Pro 


Lys 


Pro Ser Thr 


Gln










200 








205 




210 


Tyr 


Gly 


Ala 


Pro 


Gly 


Pro 


Phe 


Ala Ala 


Pro 


Gly 


Glu Gly Gly 


Ala 










215 








220 




225 


Leu 


Ala 


Ala 


Thr 


Gly 


Arg 


Pro 


Pro Leu 


Leu 


Pro 


Thr Arg Ala 


Ser 










230 








235 




240 


Arg 


Ser 


Gln


Arg 


Ala 


Ala 


Ser 


Glu Asp 


Met 


Thr 


Ser Asp Glu 


Glu 










245 






250 




255 


Arg 


Met 


Val 


Ile


Cys 


Glu 


Glu 


Glu Gly 


Asp 


Asp 


Asp Val Ile


Ala 










260 








265 






270 


Asp 


Asp 


Gly 


Phe 


Gly 


Pro 


Thr 


Asp Leu 


Asp 


Leu 


Lys Cys Lys 


Glu 










275 








280 






285 


Arg 


Val 


Thr 


Asp 


Ser 


Glu 


Ser 


Gly Asp 


Ser 


Ser 


Gly Glu Asp 


Pro 










290 








295 






300 


Glu 


Gly 


Asn 


Lys 


Gly 


Phe 


Gly 


Arg Lys 


Val 


Phe 


Ser Pro Val 


Ile










305 








310 






315 


Arg 


Ser 


Ser 


Phe 


Thr 


His 


Cys 


Arg Pro 


Pro 


Leu 


Asp Pro Glu 


Pro 










320
Pro


Gly 


Pro 


Pro 


Asp 


Pro 


Pro 


Val Ala 


Phe 


Gly 


Lys Gly Tyr 


Gly 










335 








340 






345 


Ser 


Ala 


Pro 


Ser 


Ser 


Ser 


Ala 


Ser Ser 


Pro 


Ala 


Ser Ser Ser 


Ala 










350 








355 






360 


Ser 


Ala 


Ala 


Thr 


Ser 


Phe 


Ser 


Leu Gly 


Ser 


Gly 


Thr Phe Lys 


Ala 










365 








370 


375 


Gln


Glu 


Ser 


Gly 


Gln


Gly 


Ser 


Thr Ala 


Gly 


Pro 


Leu Arg Pro 


Pro 










380 








385 






390 


Pro 


Pro 


Gly 


Ala 


Gly 


Gly 


Pro 


Ala Thr 


Pro 


Ser 


Lys Ala Thr 


Arg 










395 








400 






405 


Phe 


Leu 


Pro 


Met 


Asp 


Pro 


Ala 


Thr Phe 


Arg 


Arg 


Lys Arg Pro 


Glu 










410 








415 






420 


Ser 


Val 


Gly 


Gly 


Leu 


Glu 


Pro 


Pro Gly 


Pro 


Ser 


Val Ile Ala


Ala 










425 






430 






435 


Pro 


Pro 


Ser 


Gly 


Gly 


Gly 


Asn 


Ile Leu


Gln


Thr 


Leu Val Leu 


Pro 
440










445 






450 


Pro 


Asn 


Lys Glu 


Glu 
455 


Gln


Glu 


Gly 


Gly 


Gly 
460 


Ala 


Arg Val 


Pro Ser 
465 


Ala 


Pro 


Ala Pro 


Ser 
470 


Leu 


Ala 


Tyr 


Gly 


Ala 
475 


Pro 


Ala Ala 


Pro Leu 
480 


Ser 


Arg 


Pro Ala 


Ala 


Thr 


Met 


Val 


Thr 


Asn 


Val 


Val Arg 


Pro Val 








485 










490 




495 


Ser 


Ser 


Thr Pro 


Val 
500 


Pro 


Ile


Ala 


Ser 


Lys 
505 


Pro 


Phe Pro 


Thr Ser 
510 


Gly 


Arg 


Ala Glu 


Ala 
515 


Ser 


Pro 


Asn 


Asp 


Thr 
520 


Ala 


Gly Ala 


Arg Thr 
525 


Glu 


Met 


Gly Thr 


Gly 
530 


Ser 


Arg 


Val 


Pro 


Gly 
535 


Gly 


Ser Pro 


Leu Gly 
540 


Val 


Ser 


Leu Val 


Tyr 
545 


Ser 


Asp 


Lys 


Lys 


Ser 
550 


Ala 


Ala Ala 


Thr Ser 
555 


Pro 


Ala 


Pro His 


Leu 


Val 


Ala 


Gly 


Pro 


Leu 


Leu 


Gly Thr 


Val Gly 








560 










565 




570 


Lys 


Ala 


Pro Ala 


Thr 
575 


Val 


Thr 


Asn 


Leu 


Leu 
580 


Val 


Gly Thr 


Pro Gly 
585 


Tyr 


Gly 


Ala Pro 


Ala 
590 


Pro 


Pro 


Ala 


Val 


Gln
595 


Phe 


Ile Ala


Gln Gly
600 


Ala 


Pro 


Gly Gly 


Gly 
605 


Thr 


Thr 


Ala 


Gly 


Ser 
610 


Gly 


Ala Gly 


Ala Gly 
615 


Ser 


Gly 


Pro Asn 


Gly 
620 


Pro 


Val 


Pro 


Leu 


Gly 
625 


Ile


Leu Gln


Pro Gly 
630 


Ala 


Leu 


Gly Lys 


Ala 
635 


Gly 


Gly 


Ile


Thr 


Gln
640 


Val 


Gln Tyr


Ile Leu
645 


Pro 


Thr 


Leu Pro 


Gln
650 


Gln


Leu 


Gln


Val 


Ala 
655 


Pro 


Ala Pro 


Ala Pro 
660 


Ala 


Pro 


Gly Thr 


Lys 


Ala 


Ala 


Ala 


Pro 


Met 


Arg 


Pro Cys 


Thr His 








665 










670 


675 


His 


Gln


His Pro 


Phe 
680 


His 


Pro 


Pro 


Thr 


Gly 
685 


His 


Phe His 


Gln Arg
690 


Gln


Ser 


Pro Gly 


Cys 
695 


His 


Cys 


Thr 


His 


Ser 
700 


Trp 


His Pro 


His Pro 
705 


Ala 


Val 


Cys Thr 


Leu 
710 


Arg 


Pro 


Thr 


Pro 


Gln
715 


Ser 


Pro Val 


Ser Phe 
720 


Ser 


Arg 


Ala Gly 


Pro 
725 


Ala 


Pro 


Gly 


Trp 


Leu 
730 


Ser 


Pro Ala 


Ala Ala 
735 


Trp 


Glu 


Gly Pro 


Ser 
740 


Ala 


Ser 


Gly 


Arg 


Pro 
745 
Leu


Ala 


Met 


Lys 


Asp 


Met 


Leu 


Thr 


Val 


Val 


Asp 


Leu Leu 


Leu Glu 


1 








5 










10 




15 


Gly 


Gly 


Ala 


Asp 


Val 


Asp 


His 


Thr 


Asp 


Asn 


Asn 


Gly Arg 


Thr Pro 










20 










25 






30 


Leu 


Leu 


Ala 


Ala 


Ala 


Ser 


Met 


Gly 


His 


Ala 


Ser 


Val Val 


Asn Thr 










35 








40 






45 


Leu 


Leu 


Phe 


Trp 


Gly 


Ala 


Ala 


Val 


Asp 


Ser 


Ile


Asp Ser 


Glu Gly 










50 










55 




60 


Arg 


Thr 


Val 


Leu 


Ser 


Ile


Ala 


Ser 


Ala 


Gln


Gly 


Asn Val 


Glu Val 










65 










70 




75 


Val 


Arg 


Thr 


Leu 


Leu 


Asp 


Arg 


Gly 


Leu 


Asp 


Glu 


Asn His 


Arg Asp 










80 










85 






90 


Asp 


Ala 


Gly 


Trp 


Thr 


Pro 


Leu 


His 


Met 


Ala 


Ala 


Phe Glu 


Gly His 










95 










100 






105 


Arg 


Leu 


Ile


Cys 


Glu 


Ala 


Leu 


Ile


Glu 


Gln


Gly 


Ala Arg 


Thr Asn 
110




Glu


Ile


Asp 


Asn Asp 


Gly Arg 








125 




Glu


Gly 


His 


Tyr Asp 


Cys Val 








140 




Asn 


Ile


Asp 


Gln Arg


Gly Tyr 








155 


Ala 


Ala 


Leu 


Glu Gly 


His Arg 








170 


His 


Gly 


Ala 


Asp Val 


Asn Cys 








185 




Leu 


Tyr 


Ile


Leu Ala 


Leu Glu 








200 




Phe 


Leu 


Glu 


Asn Gly 


Ala Asn 








215 




Arg 


Thr 


Ala 


Leu His 


Val Ser 








230 




Gly 


Ala 


Gly 


Pro Asp 


Ser Ile








245 




Gln











115 






120 


Ile Pro Phe


Ile


Leu Ala Ser 


Gln


ion 
130 






135 


Gln Ile Leu


Leu 


Glu Asn Lys 


Ser 


145 




150 


Asp Gly Arg 


Asn 


Ala Leu Arg 


Val 


160 




165 


Asp Ile Val


Glu 


Leu Leu Phe 


Ser 


175 






180 


Lys Asp Ala 


Asp 


Gly Arg Pro 


Thr 


190 






195 


Asn Gln Leu


Thr 


Met Ala Glu 


Tyr 


205 






210 


Val Glu Ala 


Ser 


Asp Ala Glu 


Gly 


220 






225 


Cys Trp Gln


Gly 


His Met Gly 


Asn 


235 






240 


Pro Cys Arg 


Arg 


Gln Cys Cys


Arg 


250 






255 
Met


Pro 


Ile


Leu Pro 


Ile


Ser Val Gln


Leu 


Asp 


Ala Ser Leu 


Leu 


1 






5 






10 




15 


Ile


Cys 


Leu 


Val Ile


Cys 


Ala Gly Arg 


Phe 


Trp 


Thr Asn Leu 


Tyr 








20 






25 






30 


Ser 


Leu 


Thr 


Val Pro 


Phe 


Gly Gln Lys


Pro 


Asn 


Ile Asp Val


Thr 








35 






40 






45 


Asp 


Ala 


Met 


Val Asp 


Gln


Ala Trp Asp 


Ala 


Gln


Arg Ile Phe


Lys 








50 






55 






60 


Glu 


Ser 


Ala 


Glu Leu 


Leu 


Cys Ile Cys


Trp 


Ser 


Ser Leu Tyr 


Asp 








65 






70 




75 


Ser 


Arg 


Ile


Leu Arg 


Gln


Ile Pro Cys


Tyr 


Thr 


Asp Pro Gly 


Asn 








80 






85 




90 


Val 


Gln


Lys 


Ala Leu 


Cys 


His Pro His 


Ser 


Leu 


Gly Pro Gly 


Glu 








95 






100 






105 


Gly 


Arg 


Leu 


Gln Arg


Ser 


Leu Cys Ala 


Gln


Arg 


Val Thr Met 


Asp 








110 






115 






120 


Asp 


Phe 


Leu 


Thr Ala 


His 


His Glu Met 


Gly 


His 


Ile Gln Tyr


Asp 








125 






130 




135 


Met 


Ala 


Tyr 


Ala Gly 


Gln


Pro Phe Ser 


Ala 


Lys 


Glu Met Glu 


Leu 








140 






145 




150 


Asn 


Glu 


Gly 


Phe His 


Glu


Ala Val Gly 


Glu 


Ile


Met Ser Leu 


Ser 








155 




160 






165 


Ala 


Ala 


Thr 


Pro Lys 


His 


Leu Lys Ser 


Ile


Gly 


Leu Leu Ser 


Pro 








170 






175 




180 


Glu 


Phe 


Ser 


Thr Asn 


Asp 


Asn Glu Thr 


Glu 


Ile


Asn Phe Leu 


Leu 








185 






190 






195 


Lys 


Gln


Ala 


Leu Thr 


Ile


Val Gly Thr 


Leu 


Pro 


Phe Thr Tyr 


Met 








200 






205 






210 


Leu 


Glu 


Lys 


Trp Arg 


Trp 


Met Val Phe 


Lys 


Arg 


Gly Asn Ser 


Gln








215 






220 




225 


Arg 


Pro 


Val 


Gly Glu 


Lys 


Gly Gly Gly 


Arg 














230 






235 
<400> 83



Asn 


Met 


Ala 


Gln


Phe 


Tyr 


Tyr 


Lys 


Arg 


Asn 


Val 


Asn Ala Pro 


Tyr 


1 








5 










10 






15 


Arg 


Asp 


Arg 


Ile


Pro 


Leu 


Arg 


Ile


Val 


Arg 


Ala 


Glu Ser Glu 


Leu 










20 










25 






30 


Ser 


Pro 


Ser 


Glu 


Lys 


Ala 


Tyr 


Leu 


Asn 


Ala 


Val 


Glu Lys Gly 


Asp 










35 










40 






45 


Tyr 


Ala 


Ser 


Val 


Lys 


Lys 


Ser 


Leu 


Glu 


Glu 


Ala 


Glu Ile Tyr


Phe 










50 










55 




60 


Lys 


Ile


Asn 


Ile


Asn 


Cys 


Ile


Asp 


Pro 


Leu 


Gly 


Arg Thr Ala 


Leu 










65 










70 




75 


Leu 


Ile


Ala 


Ile


Glu 


Asn 


Glu 


Asn 


Leu 


Glu 


Leu 


Ile Glu Leu


Leu 










80 










85 






90 


Leu 


Ser 


Phe 


Asn 


Val 


Tyr 


Val 


Gly 


Asp 


Ala 


Leu 


Leu His Ala 


Ile










95 






100 






105 


Arg 


Lys 


Glu 


Val 


Val 


Gly 


Ala 


Val 


Glu 


Leu 


Leu 


Leu Asn His 


Lys 










110 










115 






120 


Lys 


Pro 


Ser 


Gly 


Glu 


Lys 


Gln


Val 


Pro 


Pro 


Ile


Leu Leu Asp 


Lys 










125 










130 




135 


Gin 


Phe 


Ser 


Glu 


Phe 


Thr 


Pro 


Asp 


Ile


Thr 


Pro 


Ile Ile Leu


Ala 










140 








145 






150 


Ala 


His 


Thr 


Asn 


Asn 


Tyr 


Glu 


Ile


Ile


Lys 


Leu 


Leu Val Gln


Lys 










155 










160 






165 


Gly 


Val 


Ser 


Val 


Pro 


Arg 


Pro 


His 


Glu 


Val 


Arg 


Cys Asn Cys 


Val 










170 










175 




180 


Glu 


Cys 


Val 


Ser 


Ser 


Ser 


Asp 


Val 


Asp 


Ser 


Leu 


Arg His Ser 


Arg 










185 










190 






195 


Ser 


Arg 


Leu 


Asn 


Ile


Tyr 


Lys 


Ala 


Leu 


Ala 


Ser 


Pro Ser Leu 


Ile










200 










205 






210 


Ala 


Leu 


Ser 


Ser 


Glu 


Asp 


Pro 


Phe 


Leu 


Thr 


Ala 


Phe Gln Leu


Ser 










215 








220 






225 


Trp 


Glu 


Leu 


Gln


Glu 


Leu 


Ser 


Lys 


Val 


Glu 


Asn 


Glu Phe Lys 


Ser 










230 










235 




240 


Glu 


Tyr 


Glu 


Glu 


Leu 


Ser 


Arg 


Gln


Cys 


Lys 


Gln


Phe Ala Lys 


Asp 










245 










250 




255 


Leu 


Leu 


Asp 


Gln


Thr 


Arg 


Ser 


Ser 


Arg 


Glu 


Leu 


Glu Ile Ile


Leu 










260 










265 






270 


Asn 


Tyr 


Arg 


Asp 


Asp 


Asn 


Ser 


Leu 


Ile


Glu 


Glu 


Gln Ser Gly


Asn 










275 










280 




285 


Asp 


Leu 


Ala 


Arg 


Leu 


Lys 


Leu 


Ala 


Ile


Lys 


Tyr 


Arg Gln Lys


Glu 










290 










295 




300 


Phe 


Val 


Ala 


Gln


Pro 


Asn 


Cys 


Gln


Gln


Leu 


Leu 


Ala Ser Arg 


Trp 










305 










310 




315 


Tyr 


Asp 


Glu 


Phe 


Pro 


Gly 


Trp 


Arg 


Arg 


Arg 


His 


Trp Ala Val 


Lys 










320










o o c 

325 






330 


Met 


Val 


Thr 


Cys 


Phe 


Ile


Ile


Gly 


Leu 


Leu 


Phe 


Pro Val Phe 


Ser 










335 










340 






345 


Val 


Cys 


Tyr 


Leu 


Ile


Ala 


Pro 


Lys 


Ser 


Pro 


Leu 


Gly Leu Phe 


Ile










350 










355 






360 


Arg 


Lys 


Pro 


Phe 


Ile


Lys 


Phe 


Ile


Cys 


His 


Thr 


Ala Ser Tyr 


Leu 










365 










370 






375 


Thr 


Phe 


Leu 


Phe 


Leu 


Leu 


Leu 


Leu 


Ala 


Ser 


Gln


His Ile Asp


Arg 










380 










385 




390 


Ser 


Asp 


Leu 


Asn 


Arg 


Gln


Gly 


Pro 


Pro 


Pro 


Thr 


Ile Val Glu


Trp 










395 










400 






405 


Met 


Ile


Leu 


Pro 


Trp 


Val 


Leu 


Gly 


Phe 


Ile


Trp 


Gly Glu Ile


Lys 










410 










415 






420 


Gln


Met 


Trp 


Asp 


Gly 


Gly 


Leu 


Gln


Asp 


Tyr 


Ile


His Asp Trp 


Trp 










425 










430 






435 


Asn 


Leu 


Met 


Asp 


Phe 


Val 


Met 


Asn 


Ser 


Leu 


Tyr 


Leu Ala Thr 


Ile






WO 01/62922 



PCT/US01/05896 



440 445 450 



Ser 


Leu 


Lys Ile


Val 


Ala 


Phe 


Val 


Lys 


Tyr 


Ser 


Ala Leu 


Asn 


Pro 






455 










460 








465 


Arg 


Glu 


Ser Trp 


Asp 


Met 


Trp 


His 


Pro 


Thr 


Leu 


Val Ala 


Glu 


Ala 








470 










475 








480 


Leu 


TVU 

Phe


Ala Ile


Ala 


Asn 


Ile


Phe


Ser 


Ser 


Leu 


Arg Leu 


Ile


Ser 








485 










490 






495 


Leu 


Phe 


Thr Ala 


Asn 


Ser 


His 


Leu 


Gly 


Pro 


Leu 


Gln Ile


Ser 


Leu 








500 








505 








510 


Gly 


Arg 


Met Leu 


Leu 


Asp 


Ile


Leu 


Lys 


Phe 


Leu 


Phe Ile


Tyr 


Cys 








515 










520 








525 


Leu 


Val 


Leu Leu 


Ala 


Phe 


Ala 


Asn 


Gly 


Leu 


Asn 


Gln Leu


Tyr 


Phe 








530 








535 






540 


Tyr


Tyr 


UslU vjIU 


Thr 


Lys 


Criy 


Leu 


rrru 

Thr 


Cys 


Lys 


pi,. Tl rt 


Arg 


Cys 








545 










550 








555 


Glu


Lys 


Gln Asn


Asn 


Ala 


Phe 


Ser 


Thr 


Leu 


Phe 


Glu Thr 


Leu Gln






560 










565 








570 


Ser 


Leu 


Phe Trp 


Ser 


Ile


Phe 


Gly 


Leu 


Ile


Asn 


Leu Tyr 


Val 


Thr 








575 








580 






585 


Asn 


Val 


Lys Ala 


Gln


His 


Glu 


Phe 


Thr 


Glu 


Phe 


Val Gly 


Ala 


Thr 








590 










595 








600 


Leu 


Phe 


Gly Asp 


Ile


Thr 


Met 


Ser 


Ser 


Leu 


Trp 


Leu Phe 


Tyr


Ser 



605 610 615 

Thr Cys 
Gly


Ala 


His 


Ala 


Lys 


Thr 


Gly 


Ile


Gln


Ile


Gly 


Met Leu 


Ser 


Thr 


1 








5 










10 








15 


Gly 


Lys 


Asp 


Arg 


Ser 


Leu 


Arg 


Val 


Thr 


Gly 


Met 


Thr Trp 


Arg 


Ser 










20 










25 








30 


Ser 


Tyr 


Val 


Pro 


Val 


Ser 


Ala 


Pro 


Pro 


Pro 


Asn 


Ser Ser 


Glu 


Gln










35 










40 








45 


Tyr 


Ser 


Ser 


Gly 


Ala 


Gln


Ser 


Ile


Pro 


Ser 


Thr 


Val Thr 


Val 


Ile










50 










55 








60 


Ala 


Pro 


Trp 


Ser 


Pro 


Thr 


Leu 


Glu 


Asn 


Thr 


Thr 


Trp Glu 


Leu 


Val 










65 










70 






75 


Leu 


Leu 


Leu 


Leu 


Lys 


Ile


Ile


Ser 


Ser 


Ser 


Asn 


Ser Phe 


Gly Arg 










80 










85 








90 


Asn 


Leu 


Pro 


Pro 


Lys 


Arg 


Arg 


Cys 


Arg 


Asp 


Tyr 


Asp Glu 


Arg Gly 










95 










100 








105 


Phe 


Cys 


Val 


Leu 


Gly 


Asp 


Leu 


Cys 


Gln


Phe 


Asp 


His Gly 


Asn Asp 










110 










115 








120 


Pro 


Leu 


Val 


Val 


Asp 


Glu 


Val 


Ala 


Leu 


Pro 


Ser 


Met Ile


Pro 


Phe 










125 










130








135 


Pro 


Pro 


Pro 


Pro 


Pro 


Gly 


Leu 


Pro 


Pro 


Pro 


Thr 


Thr Pro 


Gly Met 










140 










145 








150 


Leu 


Met 


Pro 


Pro 


Met 


Pro 


Gly 


Pro 


Gly 


Pro 


Gly 


Pro Gly 


Pro 


Gly 










155 










160 








165 


Pro 


Gly 


Pro 


Gly 


Pro 


Gly 


Pro 


Gly 


Pro 


Gly 


Pro 


Gly His 


Ser 


Met 










170 










175 








180 


Arg 


Leu 


Pro 


Val 


Pro 


Gln


Gly 


His 


Gly 


Gln


Pro 


Pro Pro 


Ser 


Val 










185 










190 








195 


Val 


Leu 


Pro 


Ile


Pro 


Arg 


Pro 


Pro 


Ile


Thr 


Gln


Ser Ser 


Leu 


Ile










200 










205 








210 


Asn 


Ser 


Arg 


Asp 


Gln


Pro 


Gly 


Thr 


Ser 


Ala 


Val 


Pro Asn 


Leu 


Ala 










215 








220 








225 


Ser 


Val 


Gly 


Thr 


Arg 


Leu 


Pro 


Pro 


Pro 


Leu 


Pro 


Gln Asn


Leu 


Leu 
230 235 240

Tyr Thr Val Ser Glu Arg Gln Pro Met Tyr Ser Arg Glu His Gly

245 250 255 

Ala Ala Ala Ser Glu Arg Leu Gln Leu Gly Thr Pro Pro Pro Leu

260 265 270 

Leu Ala Ala Arg Leu Val Pro Pro Arg Asn Leu Met Gly Ser Ser 

275 280 285 

Ile Gly Tyr His Thr Ser Val Ser

290 
Leu


Ser 


Pro Asp 


Arg 


Leu 


Leu 


Val Leu 


Pro 


Asp 


Asn Tyr Ser 


His 


1 






5 








10 


15 


Phe 


Ser 


Gln Ala


Ser 


Ala Asn 


Leu Gln


Gly 


Pro 


Ser Arg Thr 


Thr 








20 








25 






30 


Glu 


Leu 


Phe His 


Pro 


Thr 


Leu 


Ala Ser 


Ile


Ser 


Ser Pro Met 


Leu 








35 








40 






45 


Glu 


Gly 


Ala Glu 


Leu 


Tyr Phe 


Asn Val 


Asp 


His 


Gly Tyr Leu 


Glu 








50 








55 






60 


Gly 


Leu 


Val Arg 


Gly 


Cys 


Lys 


Ala Ser 


Leu 


Leu 


Thr Gln Gln


Asp 








65 








70 






75 


Tyr 


Ile


Asn Leu 


Val 


Gln


Cys 


Glu Thr 


Leu 


Glu 


Ala Pro Phe 


Phe 








80 






85 






90 


Gln


Asp 


Cys Met 


Ser 


Glu 


Asn 


Ala Leu 


Asp 


Glu 


Leu Asn Ile


Glu 








95 








100 






105 


Leu 


Leu 


Arg Asn 


Lys 


Leu 


Tyr 


Lys Ser 


Tyr 


Leu 


Glu Ala Phe 


Tyr 








110 








115 






120 


Lys 


Phe 


Cys Lys 


Asn 


His 


Gly 


Asp Val 


Thr 


Ala 


Glu Val Met 


Cys 








125 








130 






135 


Pro 


Ile


Leu Glu 


Phe 


Glu 


Ala 


Asp Arg 


Arg 


Ala 


Phe Ile Ile


Thr 








140 








145 






150 


Leu 


Asn 


Ser Phe 


Gly 


Thr 


Glu 


Leu Ser 


Lys 


Glu 


Asp Arg Glu 


Thr 








155 








160 




165 


Leu 


Tyr 


Pro Thr 


Phe 


Arg Gln


Leu Tyr 


Pro 


Glu 


Gly Leu Arg 


Leu 








170 








175 




180 


Leu 


Ala 


Gln Ala


Glu 


Asp Phe 


Asp Gln


Met 


Lys 


Asn Val Ala 


Asp 








185 








190 






195 


His 


Tyr 


Gly Val 


Tyr 


Lys 


Pro 


Leu Phe 


Glu 


Ala 


Val Gly Gly 


Ser 








200 








205 






210 


Gly 


Gly 


Lys Thr 


Leu 


Glu Asp 


Val Phe 


Tyr 


Glu 


Arg Glu Val 


Gln
Leu


Met 


Thr 






405 


Trp 


Asn 


Ser 






420 


Val 


Glu 


Ser 






435 


Asp 


Ser 


Val 






450 


Leu 


His 


Ser 






465 


Leu 


Glu 


Ala 






480 


Asp 


Leu 


Pro 






495 


Ile


Asp 


His 






510 


Ser 


Cys 


Gln






525 


Gln


Cys 


Thr 






540 


Val 


Ser 


Ser 






555 


Asp 


Tyr 


Pro 






570 


Met 


Ser 


Glu 






585 


Gly 


Ala 


Thr 






600 


Ile


Tyr 


Leu 






615 


Thr 


Gln


Thr 






630 


Gln


Arg 


Asn 






645 


Asp 


Ser 


Ala 






660 


Glu 


Asp 


Ile






675 


Lys 


Ser 


Ile






690 


Lys 


Ser 


Ile






705 


Cys 


His 


Tyr 






720 


Gly 


Ser 


Asn 






725 730 735 

Asn Leu Ile Leu Ile Leu Leu Glu Pro Ile Pro Gln Asn Ser Ile

740 745 750 

Pro Asn Lys Tyr His Lys Leu Lys Ala Leu Met Thr Gln Arg Thr

755 760 765 

Tyr Leu Gln Trp Pro Lys Glu Lys Ser Lys Arg Gly Ala Leu Leu

770 775 780 

Gly 
Trp His


Lys Ile


Ala 


Glu Thr Tyr Ser 


Ile


Glu 


Met Gly Pro 


Arg 


1 




5 




10 




15 


Gly Pro 


Gln Cys


Glu


Gly Ala Ile Pro


Thr 


His 


Leu Pro Ala 


Leu 






20 




25 






30 


Trp Arg 


Thr Pro 


Gln


Asn Arg Pro Asn 


Ser 


Arg 


Ala Ser Lys 


Ala 






35 




40 




45 


Thr Ser 


Pro Thr 


Ser 


Ser His Pro Pro 


Met 


Leu 


Pro His Pro 


Ser 






50 




55 






60 


Thr Gly 


Ala Thr 


Asn 


Thr Leu Thr Gly 


Ser 


Ile


Thr Arg Leu 


Leu 






65 


70 




75 


His Lys 


Phe Thr 


Val 


Ile Ser Val Pro


His 


Leu 


Pro Glu Lys 


Gln






80 




85 




90 


Ala Thr 


Gly Arg 


Phe 


Glu Glu Asp Phe 


Ile


Glu 


Lys Arg Lys 


Arg 






95 




100 






105 


Arg Leu 


Ile Leu


Trp 


Met Asp His Met 


Thr 


Ser 


His Pro Val 


Leu 






110 




115 






120 


Ser Gln


Tyr Glu 


Gly 


Phe Gln His Phe


Leu 


Ser 


Cys Leu Asp 


Asp 






125 




130 






135 


Lys Gln


Trp Lys 


Met 


Gly Lys Arg Arg 


Ala 


Glu 


Lys Asp Glu 


Met 






140 




145 






150 


Val Gly 


Ala Ser 


Phe 


Leu Leu Thr Phe 


Gln


Ile


Pro Thr Glu 


His 






155 




160 






165 


Gln Asp


Leu Gln


Asp 


Val Glu Asp Arg 


Val 


Asp 


Thr Phe Lys 


Ala 






170 




175 




180 


Phe Ser 


Lys Lys 


Met 


Asp Asp Ser Val 


Leu 


Gln


Leu Ser Thr 


Val 






185 




190 






195 


Ala Ser 


Glu Leu 


Val 


Arg Lys His Val 


Gly 


Gly 


Phe Pro Gln


Gly 






200 




205 






210 


Ile Pro


Glu Arg 


Trp 


Ala Val Pro Ser 


Arg 


Pro 


Ser Val Ile


Pro 






215 




220 






225 


Ser Arg 


Trp Thr 


Pro 


Pro Phe Ala Leu 


Arg 


Pro 


Ser Thr Val 


Pro 






230 




235 






240 


Phe Leu 


Thr Arg 


Ala 


Val Pro Met Lys 


Pro 


Ser 


Gly Arg Cys 


Leu 






245 




250 




255 


Leu Ser 


Ser Pro 


Arg 


Met Thr Ser Ser 


Arg 


Cys 


Trp Thr His 


Cys 






260 




265 






270 


Leu Ser 


Thr Arg 


Ala 


Cys Ser Pro Thr 


Ser 


Leu 


Thr Ser Ser 


Ile






275 




280 






285 


Tyr Lys 


Lys Ala 


Pro 


Ser Pro Arg 











290 
Thr


Pro 


Cys 


Leu 


Gln


Glu 


Val Ala Gly 


Pro 


Leu Leu 


Asp Gly Met 


1 








5 




10 




15 


Val 


Tyr 


Ala 


Leu 


Gly 


Gly 


Met Gly Pro 


Asp 


Thr Ala 


Pro Gln Ala










20 






25 




30 


Gln


Val 


Arg 


Val 


Tyr 


Glu 


Pro Arg Arg 


Asp 


Cys Trp 


Leu Ser Leu 










35 






40 




45 


Pro 


Ser 


Met 


Pro 


Thr 


Pro 


Cys Tyr Gly 


Ala 


Ser Thr 


Phe Leu His 










50 






55 




60 


Gly 


Asn 


Lys 


Ile


Tyr 


Val 


Leu Gly Gly 


Arg 


Gln Gly


Lys Leu Pro 










65 






70 




75 


Val 


Thr 


Ala 


Phe 


Glu 


Ala 


Phe Asp Leu 


Glu 


Ala Arg 


Thr Trp Thr 










80 




85 




90 


Arg 


His 


Pro 


Ser 


Leu 


Pro 


Ser Arg Arg 


Ala 


Phe Ala 


Gly Cys Ala 










95 






100 




105 


Met 


Ala 


Glu 


Gly 


Ser 


Val 


Phe Ser Leu 


Gly 


Gly Leu 


Gln Gln Pro










110 






115 


120 


Gly 


Pro 


His 


Asn 


Phe 


Tyr 


Ser Arg Pro 


His 


Phe Val 


Asn Thr Val 










125 






130




135 


Glu 


Met 


Phe 


Asp 


Leu 


Glu 


His Gly Ser 


Trp 


Thr Lys 


Leu Pro Arg 










140 






145 




150 


Ser 


Leu 


Arg 


Met 


Arg 


Asp 


Lys Arg Ala 


Asp 


Phe Val 


Val Gly Ser 










155 






160 




165 


Leu 


Gly 


Gly 


His 


Ile


Val 


Ala Ile Gly


Gly 


Leu Gly 


Asn Gln Pro










170 






175 




180 


Cys 


Pro 


Leu 


Gly 


Ser 


Val 


Glu Ser Phe 


Ser 


Leu Ala 


Arg Arg Arg 










185 






190 




195 


Trp 


Glu 


Ala 


Leu 


Pro 


Ala 


Met Pro Thr 


Ala 


Arg Cys 


Ser Cys Ser 










200 






205 




210 


Ser 


Leu 


Gln


Ala 


Gly 


Pro 


Arg Leu Phe 


Val 


Ile Gly


Gly Val Ala 










215 






220 




225 


Gln


Gly 


Pro 


Ser 


Gln


Ala 


Val Glu Ala 


Leu 


Cys Leu 


Arg Asp Gly 










230 






235 




240 


Val 